Vol. 40, Parts 1 and 2 June 1953


BIOMETRIKA 


FOUNDED BY 


W. F. R. WELDON, FRANCIS GALTON and KARL PEARSON


MANAGING EDITOR 


E. S. PEARSON 


ASSOCIATE EDITORS 


M. G. KENDALL JOHN WISHART 
in consultation with 
HARALD CRAMER R. C. GEARY 
J. B. S. HALDANE D. G. KENDALL 


ISSUED BY 
THE BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON 


PRINTED AT THE UNIVERSITY PRESS, CAMBRIDGE 


Reprinted by offset-litho 1964





[Issued 4 June 1953]
















VOLUME 40, Parts 1 and 2 JUNE 1953












THE SUPERPOSITION OF SEVERAL STRICTLY PERIODIC 
SEQUENCES OF EVENTS 


By D. R. COX and W. L. SMITH
Statistical Laboratory, University of Cambridge 


1. INTRODUCTION 


Suppose that there are a number of sources at each of which events occur from time to time. 
Suppose further that the outputs of the various sources are combined into one pooled output. 
We shall consider the problem of deducing the statistical properties of the pooled output 
from the statistical properties of the separate outputs. By a statistical property we mean, 
for example, the frequency distribution of the time interval between successive events. 
The problem is illustrated for three sources in Fig. 1. The events are marked by short 
vertical lines and the outputs of the sources $S_1$, $S_2$ and $S_3$ are combined into a single pooled
output P. 


[Figure 1: four parallel time axes showing the events of Source 1 ($S_1$), Source 2 ($S_2$), Source 3 ($S_3$) and the pooled output $P$.]





Fig. 1. The pooling of outputs. 


If we denote by $t_j^i$ the time at which the $j$th event occurs on the $i$th source, we can state
the problem formally as follows. Given $N$ increasing sequences $S_i = [t_1^i, t_2^i, \ldots]$ $(i = 1, \ldots, N)$,
let the $t_j^i$ be rearranged into a single increasing sequence $P$. The statistical properties of
the S; are assumed known and we wish to determine various properties of the pooled 
output P. 
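
The pooling operation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `periodic_source`, `pooled_output` and `successive_intervals` are hypothetical, the sources are taken to be strictly periodic (the case treated in this paper), and the record is cut off at an arbitrary `horizon`.

```python
import heapq


def periodic_source(period, horizon):
    """Event times period, 2*period, ... up to (about) the horizon."""
    count = int(horizon / period)  # number of whole periods that fit
    return [k * period for k in range(1, count + 1)]


def pooled_output(periods, horizon):
    """Merge the already-sorted source sequences into one increasing sequence P."""
    return list(heapq.merge(*(periodic_source(p, horizon) for p in periods)))


def successive_intervals(events):
    """Intervals between successive events of a sorted event list."""
    return [later - earlier for earlier, later in zip(events, events[1:])]


if __name__ == "__main__":
    # Three sources with periods 1, sqrt(2), sqrt(3) (illustrative values only).
    pooled = pooled_output([1.0, 2 ** 0.5, 3 ** 0.5], horizon=10.0)
    print(pooled[:6])
    print(successive_intervals(pooled)[:5])
```

The frequency distribution of `successive_intervals(pooled)` over a long record is the quantity studied in the rest of the paper.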

There are two cases of special interest. First, the intervals between successive events on 
any one source may be independent identically distributed random variables, the random 
variables associated with different sources also being independent. This situation is closely 
connected with renewal theory and will be discussed in a separate paper (Cox & Smith, 
1953). The second case, to which the present paper is devoted, arises when the events on 
any one source occur at exactly regular intervals so that the sequence $S_i$ is $[\theta_i, 2\theta_i, \ldots]$,
where $\theta_i$ is the period of the $i$th source.

At first sight this is not a statistical problem at all, because the times at which events 
occur are completely determined and no probability or chance mechanism enters the system. 
That statistical concepts are in fact relevant follows from a well-known theorem in the 
theory of numbers due to H. Weyl. This states, in effect, that if the $\theta_i$ are mutually irrational,
the sources are random with respect to one another if observed after a sufficiently long time.
We make this remark precise in § 2. 

Two applications, one in experimental psychology and the other in neuro-physiology, 
will now be described. 

Conrad (1951) has investigated behaviour in situations calling for constant vigilance and 
for adaptive reactions to a changing pattern of events. He used a number of dials, on each 
of which a pointer revolved at constant speed, different for different dials. Every time a 
pointer passed a mark on its dial the operator was required to turn a knob. Conrad found 
how the number of responses that the operator omitted to make and the error of timing 
depended on the number and speed of the dials. Here the sequence of events on any one 
dial is periodic, and the pooled output is the sequence of instants at which response is called 
for from the operator. The behaviour of the operator is likely to depend both on the mean 
number of events per minute and on the number of times events occur very close together. 
It is therefore important to know the frequency distribution of the interval between 
successive events in the pooled output.* 

One neurological application arose in the recent work by Fatt & Katz (1952) on spon- 
taneous subthreshold activity at motor-nerve endings. They found that in certain many- 
branching nerve endings there are a large number of active spots each passing electrical 
disturbances into muscle fibre. Fatt & Katz showed that the frequency distribution of 
the time interval between successive disturbances in the muscle fibre agreed with the 
theoretical exponential distribution characteristic of a random series. However, they 
explicitly pointed out that the succession of pulses from a single spot might nevertheless 
be periodic. The statistical problem here is to infer as much as possible about the individual 
sources (i.e. the single active spots) from observations on the pooled output. Another 
application is obtained when the sources are neurons and the events are discrete nerve pulses 
sent to some central nerve cell. This possibility is relevant to the theory of neural nets 
developed by Householder, Landahl and others (Householder & Landahl, 1945). 


2. FREQUENCY DISTRIBUTION OF INTERVALS IN THE POOLED OUTPUT 


In this section we show how Weyl’s theorem (Weyl, 1916)† can be used to derive the
frequency distribution of the interval between successive events in the pooled output. It
will be assumed that the periods $\theta_i$ are positive numbers, and are mutually irrational in the
sense that there exists no set of positive or negative integers $n_i$, not all zero, such that

$$\sum_{i=1}^{N} n_i \theta_i = 0. \tag{1}$$

In particular, the $\theta_i$ are all different and $\theta_i/\theta_j$ is irrational for all $i \neq j$. We suppose that the
sources are numbered so that $\theta_N$ is the smallest of the $\theta_i$.

The simple form of Weyl’s theorem states that if $\theta$ is irrational, and if $\{x\}$ denotes the
fractional part of $x$, then the sequence $[\{n\theta\}]$ $(n = 1, 2, \ldots)$ is uniformly distributed over
(0, 1). More precisely, if $I_l$ is any interval of length $l$ in (0, 1) and if $p_m(I_l)$ is the proportion
of $\{\theta\}, \{2\theta\}, \ldots, \{m\theta\}$ falling in $I_l$, then

$$\lim_{m \to \infty} p_m(I_l) = l. \tag{2}$$


* We are grateful to Mr Conrad for suggesting the investigation of this distribution.
† This, though not the earliest or most accessible reference, contains a particularly elegant proof of
the theorem both in its simple and in its generalized form.



















In fact, all the frequency distributions considered in this paper are defined in an analogous 
way by limiting frequencies in a precisely defined infinite sequence. The successive members 
of the sequence are in no sense independent or random, and hence we have refrained from 
using the term probability. 

To reformulate (2) in terms of sources, consider any two sources, say the first two.
Associate with the $r$th event on the first source a quantity $x_r$ equal to the time between that
event and the immediately preceding event on the second source (see Fig. 2). We have

$$x_r = \theta_2 \{ r\theta_1/\theta_2 \}. \tag{3}$$

It follows from Weyl’s theorem that, since $\theta_1/\theta_2$ is irrational, the sequence $[x_r]$ is uniformly
distributed over $(0, \theta_2)$.
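
A rough numerical check of the limiting frequency in (2), and of the quantities $x_r = \theta_2\{r\theta_1/\theta_2\}$, is sketched below. The function names are hypothetical and the periods used in the example are arbitrary; floating-point numbers are of course only approximations to irrationals, so this illustrates the theorem rather than proving anything.

```python
import math


def proportion_in_interval(theta, lower, upper, m):
    """Proportion of {theta}, {2 theta}, ..., {m theta} falling in [lower, upper)."""
    hits = sum(1 for n in range(1, m + 1) if lower <= (n * theta) % 1.0 < upper)
    return hits / m


def preceding_gaps(theta_1, theta_2, count):
    """x_r = theta_2 * {r theta_1 / theta_2} for r = 1, ..., count."""
    return [theta_2 * ((r * theta_1 / theta_2) % 1.0) for r in range(1, count + 1)]


if __name__ == "__main__":
    # An interval of length 0.3 should receive a proportion close to 0.3.
    print(proportion_in_interval(math.sqrt(2), 0.2, 0.5, 100000))
    # The gaps should spread evenly over (0, theta_2); their mean is near theta_2 / 2.
    gaps = preceding_gaps(1.0, math.sqrt(2), 100000)
    print(sum(gaps) / len(gaps), math.sqrt(2) / 2)
```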


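[Figure 2: time axis showing the events at $(r-1)\theta_1$, $r\theta_1$, $(r+1)\theta_1$ on the first source and the intervals $x_r$, $x_{r+1}$ back to the immediately preceding events on the second source.]

Fig. 2. Definition of $x_r$.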


The generalized form of Weyl’s theorem states that if $\alpha_1, \ldots, \alpha_k$ are irrational numbers,
themselves mutually irrational in the sense of (1), then the sequences $[\{n\alpha_1\}], \ldots, [\{n\alpha_k\}]$ are
independently uniformly distributed over (0, 1). More precisely, if $I_v$ is any portion of volume
$v$ of the $k$-dimensional unit cube, and if $p_m(I_v)$ is the proportion of the $m$ $k$-plets

$$\{\alpha_1\}, \ldots, \{\alpha_k\};\ \ldots;\ \{m\alpha_1\}, \ldots, \{m\alpha_k\}$$

in $I_v$, then

$$\lim_{m \to \infty} p_m(I_v) = v. \tag{4}$$

To apply this to our problem we define for the $r$th event on, say, the $i$th source a set of
$(N-1)$ quantities $x_r^{(1)}, \ldots, x_r^{(i-1)}, x_r^{(i+1)}, \ldots, x_r^{(N)}$ analogous to $x_r$ in (3). For example, $x_r^{(l)}$ is the
interval between the $r$th event on the $i$th source and the immediately preceding event on
the $l$th source. Then (4) shows that the sequences $[x_r^{(1)}], \ldots, [x_r^{(i-1)}], [x_r^{(i+1)}], \ldots, [x_r^{(N)}]$ are
independently uniformly distributed over $(0, \theta_1), \ldots, (0, \theta_{i-1}), (0, \theta_{i+1}), \ldots, (0, \theta_N)$.

We can now calculate the frequency distribution of the interval between successive
events in the pooled output. For let $q_i(y)$ be the frequency function of the length of intervals
ending with an event from source $i$. The interval is between $(y, y + dy)$ if one source, say the
$j$th, has its $x$-value between $(y, y + dy)$ and if all the other sources have their $x$-values greater
than $y$. The frequency of the first of these events is $dy/\theta_j$, and of the second $\prod_{k \neq i,j} (\theta_k - y)/\theta_k$.

Thus, by the independence of the $x$-values,

$$q_i(y) = \sum_{j \neq i} \frac{1}{\theta_j} \prod_{k \neq i,j} \frac{\theta_k - y}{\theta_k} \qquad (0 < y < \theta_N). \tag{5}$$

Also when $i = N$ there is a point concentration $Q_N$ of frequency at $y = \theta_N$ given by

$$Q_N = \prod_{k=1}^{N-1} \frac{\theta_k - \theta_N}{\theta_k}. \tag{6}$$

If $i \neq N$ there is no point frequency.




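The overall frequency distribution is defined by a frequency function $q(y)$, and by a
point frequency $Q$, and is obtained by taking a weighted average of the $q_i(y)$. In a long time
the number of events from the $i$th source is proportional to $\theta_i^{-1}$, so that

$$q(y) = \sum_{i=1}^{N} \theta_i^{-1} q_i(y) \Big/ \sum_{i=1}^{N} \theta_i^{-1}
= \frac{\displaystyle\sum_{i=1}^{N} \sum_{j \neq i} \frac{1}{\theta_i \theta_j} \prod_{k \neq i,j} \frac{\theta_k - y}{\theta_k}}{\displaystyle\sum_{i=1}^{N} \theta_i^{-1}} \qquad (0 < y < \theta_N), \tag{7}$$

and

$$Q = \frac{\displaystyle\theta_N^{-1} \prod_{k=1}^{N-1} \frac{\theta_k - \theta_N}{\theta_k}}{\displaystyle\sum_{i=1}^{N} \theta_i^{-1}}. \tag{8}$$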


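We shall write (7) and (8) in two forms, one suitable for exact computation when $N$ is
small and the other suitable for deriving asymptotic forms when $N$ is large.
To obtain an expression for computation, we introduce the symmetric functions

$$\left.\begin{aligned}
\sigma_1^{(N)} &= \theta_1 + \ldots + \theta_N,\\
\sigma_2^{(N)} &= \sum_{i_1 > i_2} \theta_{i_1}\theta_{i_2},\\
\sigma_r^{(N)} &= \sum_{i_1 > \ldots > i_r} \theta_{i_1} \cdots \theta_{i_r},\\
\sigma_N^{(N)} &= \theta_1 \cdots \theta_N.
\end{aligned}\right\} \tag{9}$$

Then

$$q(y) = \frac{1}{\sigma_{N-1}^{(N)}} \sum_{i=1}^{N} \sum_{j \neq i} \prod_{k \neq i,j} (\theta_k - y)
= \frac{1}{\sigma_{N-1}^{(N)}} \sum_{r=0}^{N-2} (-1)^r (r+1)(r+2)\, y^r \sigma_{N-r-2}^{(N)} \qquad (0 < y < \theta_N), \tag{10}$$

and

$$Q = \frac{\displaystyle\prod_{k=1}^{N-1} (\theta_k - \theta_N)}{\sigma_{N-1}^{(N)}}. \tag{11}$$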


The coefficient of $y^r$ in (10) is a sum of products of different $\theta$’s with $(r + 2)$ missing, and hence
by symmetry must be a multiple of $\sigma_{N-r-2}^{(N)}$. The numerical coefficient is obtained by
considering the number of ways of permuting the suffices $i$, $j$, $k$.

Equations (10) and (11) define the frequency distribution of intervals on the pooled
output. The distribution consists of a discrete frequency, $Q$, at the smallest period $\theta_N$, and
a continuous frequency curve between 0 and $\theta_N$ defined by a polynomial of degree $(N - 2)$.
When $N$ is small (10) and (11) are easily computed numerically; some examples are given
in §5. The symmetric functions, $\sigma_r^{(N)}$, can be found systematically by a method of
undetermined multipliers.
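
The computation can be sketched as follows. This is an illustrative sketch, not the authors' procedure: the function names are hypothetical, the symmetric functions are built by the usual one-period-at-a-time recurrence rather than by undetermined multipliers, and the convention $\sigma_0^{(N)} = 1$ is assumed for the term $r = N - 2$. The continuous part is evaluated both from the double sum over products of $(\theta_k - y)$ and from the polynomial in $y$, so the two forms can be compared.

```python
from math import prod  # Python 3.8+


def elementary_symmetric(periods):
    """Return [sigma_0, sigma_1, ..., sigma_N] with sigma_0 = 1 (assumed convention)."""
    sigma = [1.0] + [0.0] * len(periods)
    for theta in periods:
        for r in range(len(periods), 0, -1):
            sigma[r] += theta * sigma[r - 1]
    return sigma


def density_product_form(y, periods):
    """q(y) from the double sum over ordered pairs (i, j) of products of (theta_k - y)."""
    n = len(periods)
    if not 0.0 < y < min(periods):
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += prod(periods[k] - y for k in range(n) if k not in (i, j))
    return total / elementary_symmetric(periods)[n - 1]


def density_polynomial_form(y, periods):
    """q(y) as a polynomial of degree N - 2 in y."""
    n = len(periods)
    if not 0.0 < y < min(periods):
        return 0.0
    sigma = elementary_symmetric(periods)
    terms = ((-1) ** r * (r + 1) * (r + 2) * y ** r * sigma[n - r - 2] for r in range(n - 1))
    return sum(terms) / sigma[n - 1]


def point_frequency(periods):
    """Q: discrete frequency at the smallest period."""
    smallest = min(periods)
    others = list(periods)
    others.remove(smallest)
    return prod(theta - smallest for theta in others) / elementary_symmetric(periods)[len(periods) - 1]
```

For two sources with periods 1.5 and 1 the sketch gives a flat density of 0.8 on (0, 1) and a point frequency of 0.2 at $y = 1$, so the total frequency is 1.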

The mean of the distribution (10), (11) can be found from first principles. In a very long
time $T$ there are asymptotically $T\theta_i^{-1}$ events on the $i$th source, and hence asymptotically
$T\sum\theta_i^{-1}$ events altogether. Therefore

$$E(y) = \Bigl[\sum_{i=1}^{N} \theta_i^{-1}\Bigr]^{-1}. \tag{12}$$








The method used to obtain the frequency distribution of the interval between events in
the pooled output can be extended to give the joint frequency distribution of successive
intervals.


3. ASYMPTOTIC PROPERTIES


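In this section we first show that if the $\theta_i$ satisfy some weak conditions, the distribution (7)
tends to the exponential distribution as $N$ tends to infinity.
Let us set $\chi_i = \theta_i^{-1}$ and express $y$ as a multiple of its mean by writing

$$z = y \sum_{i=1}^{N} \chi_i. \tag{13}$$

The frequency function, $r(z)$, of $z$ is then from (7)

$$r(z) = \frac{\displaystyle\sum_{i=1}^{N} \sum_{j \neq i} \chi_i \chi_j \prod_{k \neq i,j} \Bigl(1 - \frac{z\chi_k}{\sum\chi}\Bigr)}{(\sum\chi)^2} = \frac{d^2 p(z)}{dz^2}, \tag{14}$$

where

$$p(z) = \prod_{i=1}^{N} \Bigl(1 - \frac{z\chi_i}{\sum\chi}\Bigr). \tag{15}$$

To discuss the limiting form of (15) we introduce the moments of the finite population
$\chi_1, \ldots, \chi_N$:

$$\mu = (\sum\chi_i)/N, \qquad \mu'_s = (\sum\chi_i^s)/N \quad (s = 2, 3, \ldots). \tag{16}$$

Then

$$\log p(z) = \sum_{i=1}^{N} \log\Bigl(1 - \frac{z\chi_i}{N\mu}\Bigr)
= -z - \frac{z^2 \mu'_2}{2N\mu^2} - \frac{z^3 \mu'_3}{3N^2\mu^3} - \ldots \tag{17}$$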


To justify a limiting operation on (17) various assumptions about the $\chi_i$ are possible.
They all have the effect of ensuring that no small group of periods is comparable with the
mean interval (12) as $N$ tends to infinity. The strongest assumption is:

(ia) As $N$ tends to infinity the $\chi_i$ have both an upper bound and a non-zero lower bound.

This implies:

(ib) As $N$ tends to infinity the dimensionless moments $\mu'_s/\mu^s$ are bounded.

A weaker assumption still is: 

(ic) As N tends to infinity 4{/u* < A*N*-* for some A, 8 = 2, 3,.... 

Under any of (ia), (ib) or (ic) we can let $N$ tend to infinity in (17) and obtain

$$p(z) \sim e^{-z}, \tag{18}$$

and

$$r(z) = \frac{d^2 p(z)}{dz^2} \sim e^{-z}. \tag{19}$$

The differentiation of (18) is justified because $p(z)$ for all finite $N$ and the limit function
$e^{-z}$ are integral functions. If we retain the term in $N^{-1}$, we get under assumption (ia) or
(ib) the second approximation

$$r_1(z) = e^{-z} \Bigl[1 - \frac{(1 + C^2)(z^2 - 4z + 2)}{2N}\Bigr], \tag{20}$$

where $C^2 = \mu'_2/\mu^2 - 1$ is the squared coefficient of variation of the $\chi_i$.
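
The second approximation can be compared with the exact curve numerically. The sketch below assumes the reading $r_1(z) = e^{-z}[1 - (1 + C^2)(z^2 - 4z + 2)/(2N)]$, which is what retaining the term in $N^{-1}$ of (17) and differentiating twice gives; the function names are hypothetical. A convenient check on that reading is that the two curves agree exactly at $z = 0$, where both equal $1 - (1 + C^2)/N$.

```python
import math


def exact_scaled_density(z, periods):
    """r(z) = d^2 p / dz^2 with p(z) = prod(1 - z chi_i / sum(chi)), below the point frequency."""
    chi = [1.0 / p for p in periods]
    total = sum(chi)
    if not 0.0 <= z < total / max(chi):
        return 0.0
    value = 0.0
    for i, ci in enumerate(chi):
        for j, cj in enumerate(chi):
            if i == j:
                continue
            term = ci * cj / total ** 2
            for k, ck in enumerate(chi):
                if k != i and k != j:
                    term *= 1.0 - z * ck / total
            value += term
    return value


def second_approximation(z, periods):
    """e^{-z} [1 - (1 + C^2)(z^2 - 4z + 2) / (2N)], C being the coefficient of variation of the chi_i."""
    chi = [1.0 / p for p in periods]
    n = len(chi)
    mu = sum(chi) / n
    mu2 = sum(c * c for c in chi) / n
    c_squared = mu2 / mu ** 2 - 1.0
    return math.exp(-z) * (1.0 - (1.0 + c_squared) * (z * z - 4.0 * z + 2.0) / (2.0 * n))


if __name__ == "__main__":
    example_c = [1, 1.1327, 1.3819, 1.4662, 1.6173, 1.6593, 1.7814, 1.8181, 2.2203, 2.6421]
    for z in (0.0, 0.5, 1.0, 2.0):
        print(z, exact_scaled_density(z, example_c), math.exp(-z), second_approximation(z, example_c))
```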













The continuous frequency curve, $r(z)$, has range $0 < z < \chi_N^{-1} \sum\chi_i$. We shall assume that the
smallest period $\theta_N$ is large compared with the mean interval, i.e. that

$$\text{(ii)} \qquad \lim_{N \to \infty} \chi_N^{-1} \sum_{i=1}^{N} \chi_i = \infty, \tag{21}$$


and it can then be shown that the point frequency (8) tends to zero. Thus the continuous 
frequency curve accounts asymptotically for the whole of the frequency. 

We have shown that under assumptions (i) and (ii) the frequency distribution of the 
interval between successive events on the pooled output tends to the exponential curve. 
It can be shown similarly that successive intervals tend to be distributed in independent 
exponential distributions. These results are but one aspect of what may be called the local 
randomness of the pooled output. As another illustration of local randomness we now prove 
that, if $t$ is small compared with all the periods $\theta_i$, the number of events observed in time
$t$ follows a Poisson distribution of mean $t(\sum\chi_i)$. The frequency distribution of the number of
events in time $t$ is defined by taking a large number of intervals of length $t$ spread uniformly
over a very long time, and finding the proportion of intervals containing 0, 1, 2, ... events.
More precisely, the intervals should be of the form $(n\alpha, n\alpha + t)$ $(n = 1, 2, \ldots)$, where $\alpha$ is
mutually irrational with $\theta_1, \ldots, \theta_N$.

Consider first a single source. A time interval $t < \theta_i$ will include either no event or one
event, and the proportion of intervals including one event is $t/\theta_i$. Thus the generating
function of the frequencies is

$$(1 - t/\theta_i) + \zeta t/\theta_i = 1 - (1 - \zeta)\, t\chi_i.$$

It follows from the independence of the sources that the generating function for the pooled
output is

$$G(\zeta) = \prod_{i=1}^{N} [1 - (1 - \zeta)\, t\chi_i],$$

so that

$$\log G(\zeta) = \sum_{i=1}^{N} \log[1 - (1 - \zeta)\, t\chi_i].$$

If we assume that $t\chi_i$ is small for all $i$,

$$\log G(\zeta) \sim \sum_{i=1}^{N} -t\chi_i (1 - \zeta),$$

i.e.

$$G(\zeta) \sim \exp[-t(1 - \zeta) \sum\chi_i], \tag{22}$$

the generating function of the Poisson distribution.

The asymptotic equation (22) holds for times $t$ small compared with the individual
periods. If $N$ is large, $t$ may nevertheless contain an appreciable number of events on the
pooled output. If $N$ is small $t$ must be smaller than, or of the order of, the mean interval
between events and in this case the result is, in a sense, vacuous.
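
The quality of the Poisson approximation can be examined by expanding the product form of the generating function term by term: each source contributes no event with frequency $1 - t/\theta_i$ and one event with frequency $t/\theta_i$. The sketch below does this and sets the result beside the Poisson frequencies; the function names and the example periods are hypothetical.

```python
import math


def count_distribution(t, periods):
    """Frequencies of 0, 1, 2, ... events in an interval t shorter than every period."""
    dist = [1.0]
    for period in periods:
        if not 0.0 <= t < period:
            raise ValueError("t must be smaller than every period")
        hit = t / period  # frequency with which this source contributes one event
        updated = [0.0] * (len(dist) + 1)
        for k, freq in enumerate(dist):
            updated[k] += freq * (1.0 - hit)
            updated[k + 1] += freq * hit
        dist = updated
    return dist


def poisson_frequency(k, mean):
    """Poisson frequency of exactly k events."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    periods = [100.0 + 0.37 * i for i in range(100)]  # many long periods (illustrative)
    t = 1.0
    mean = t * sum(1.0 / p for p in periods)
    exact = count_distribution(t, periods)
    for k in range(4):
        print(k, exact[k], poisson_frequency(k, mean))
```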

The results of this section are special cases of the following intuitively obvious principle. 
Suppose that there are a number of sources and that we observe some property of the 
pooled output depending only on its behaviour over times small compared with the indi- 
vidual recurrence times. Then, except in degenerate cases, the result is indistinguishable 
from that for a random series, no matter what the form of the individual outputs. In 
particular if the number of sources and the individual recurrence times are large the 
output will be random over intervals containing many events. 










4. THE VARIANCE-TIME CURVE 


The pooled output is distinguished from a random series by its behaviour over lengths of
time comparable with the individual periods $\theta_i$. The most convenient way of expressing
this behaviour is by the variance-time curve, $V(t)$. This is defined as the variance of the
number of events occurring in a time $t$, considered as a function of $t$. For a random series

$$V(t) = \lambda t, \tag{23}$$

where $\lambda^{-1}$ is the mean interval between successive events. The analogue of $V(t)$ for
continuous stochastic processes has been extensively used, for example in studies of sampling
and in work on irregularity in textile processing.

To find $V(t)$ for the pooled output of periodic sources, consider first a single source. Let
$\chi_i t = n_i + \beta_i$, where $n_i$ is an integer and $0 \le \beta_i < 1$, i.e.

$$\beta_i = \{\chi_i t\}.$$

Then an interval of length $t$ contains either $n_i$ or $n_i + 1$ events from this source and the
limiting frequency of intervals containing $n_i + 1$ events is $\beta_i$. The variance of the two-point
distribution is $\beta_i(1 - \beta_i)$. Since the different sources are, by Weyl’s theorem, independent,
we have for $N$ sources

$$V(t) = \sum_{i=1}^{N} \beta_i (1 - \beta_i). \tag{24}$$

More generally the generating function of the frequency distribution of the number of
events in an interval $t$ is

$$G(\zeta) = \prod_{i=1}^{N} \zeta^{n_i} (1 - \beta_i + \zeta\beta_i). \tag{25}$$

The general behaviour of (24) can be seen as follows. If $t$ is very small compared with
the $\theta_i$,

$$\beta_i (1 - \beta_i) \sim \beta_i = t\chi_i,$$

so that

$$V(t) \sim t \sum\chi_i = t\lambda, \tag{26}$$

where $\lambda^{-1}$ is the mean interval between successive events. Further

$$\beta_i (1 - \beta_i) < t\chi_i,$$

so that

$$V(t) < t\lambda. \tag{27}$$

For $t$ large compared with $\theta_i$, $\beta_i(1 - \beta_i) \ll t\chi_i$, so that $V(t) \ll t\lambda$.

Finally, as $t$ increases, $\beta_i$ takes each value between 0 and 1 equally often giving $\beta_i(1 - \beta_i)$
an average value of $\tfrac{1}{6}$. Thus for large $t$, $V(t)$ oscillates about an average of $\tfrac{1}{6}N$.

To sum up, $V(t)$ is tangential at $t = 0$ to the straight line (23) representing a random series
with the same mean number of events per unit time. $V(t)$ falls below (23) as soon as $t$ is
comparable with an appreciable number of the periods $\theta_i$, and finally as $t$ tends to infinity
$V(t)$ oscillates about $\tfrac{1}{6}N$. Some numerical examples are given in § 5.
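
The variance-time curve of pooled periodic sources is simple to evaluate: each source contributes $\beta_i(1 - \beta_i)$, where $\beta_i$ is the fractional part of $t/\theta_i$. The sketch below uses a hypothetical function name and arbitrary example periods; it checks the two limiting statements, tangency to $\lambda t$ near the origin and a long-run average of $\tfrac{1}{6}N$.

```python
def variance_time(t, periods):
    """V(t) = sum over sources of beta_i (1 - beta_i), with beta_i = fractional part of t / theta_i."""
    total = 0.0
    for period in periods:
        beta = (t / period) % 1.0
        total += beta * (1.0 - beta)
    return total


if __name__ == "__main__":
    periods = [1.0, 2 ** 0.5, 3 ** 0.5]  # illustrative periods
    rate = sum(1.0 / p for p in periods)  # lambda, the mean number of events per unit time
    print(variance_time(0.01, periods), 0.01 * rate)  # nearly equal for small t
    grid = [0.05 * k for k in range(1, 40001)]
    print(sum(variance_time(t, periods) for t in grid) / len(grid), len(periods) / 6)
```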

We have assumed above that $N$ is finite and this is certainly true in the applications we
have in mind. We now give a very brief discussion of the behaviour when $N$ is infinite. If
$\sum\chi_i$ is divergent the mean interval between successive events is zero so that time intervals
can be found containing infinitely many events. If we exclude this possibility $\sum\chi_i$ must
converge. By the argument for the finite case, $V(t)$ oscillates for large $t$ about the curve
$\tfrac{1}{6}N(t)$, where $N(t)$ is the number of periods less than $t$. As $t$ tends to infinity, $N(t)$ tends to
infinity but the rate of increase of $N(t)$ tends to zero by the convergence of $\sum\chi_i$.


5. NUMERICAL EXAMPLES 
Three special cases with $N$ small have been investigated numerically. They are

Example A: $N = 4$, Periods 1, 1.1317, 1.2781, 1.3486.

Example B: $N = 4$, Periods 1, 1.1317, 3.2781, 6.8914.

Example C: $N = 10$, Periods 1, 1.1327, 1.3819, 1.4662, 1.6173, 1.6593, 1.7814, 1.8181,
2.2203, 2.6421.

Fig. 3 shows the frequency distribution of the interval between successive events giving 

(a) the exact polynomial (10); 

(b) the exponential curve of the same mean; 

(c) the second approximation (20). 

Fig. 4 shows the variance-time curves computed from equation (24). The curve for
Example C has approximately thirty cusps in the range shown; no attempt has been made 
to depict all these. 

The following conclusions may be drawn: 

(i) The second approximation (20) works well even when N is quite small. 
(ii) For a given number of periods, the distribution of intervals is closest to an
exponential distribution when the periods $\theta_i$ are nearly equal.

(iii) The distribution of intervals tends rapidly to the exponential form as $N$ increases,
if the $\theta_i$ are close together.

(iv) The variance-time curves have a characteristic form entirely different from that 
for a random series. 

For computational convenience the $\theta_i$ could not be mutually irrational. We may either
regard them as close approximations to irrationals or argue that since the lowest common
multiple of the $\theta_i$ is extremely large, the results derived on the assumption (1) are likely to
be extremely good approximations to the behaviour for rational $\theta_i$.


6. ANALYSIS OF DATA 


Suppose we are given a series of events suspected to be the pooled output of a number of 
periodic sources. What methods of analysis can be suggested ? 

If the number of sources is small and if the series available for analysis is long, it is possible
in principle both to determine the $\theta_i$ exactly and to assign each event to its appropriate
source. To do this, first form the frequency distribution of the intervals between successive
events. This will be bounded by a point concentration of frequency whose position will
determine the smallest period $\theta_N$. Next find an interval of length $\theta_N$ and from it build up
the output of the $N$th source by repeated addition and subtraction of $\theta_N$. Delete this set of
events and then analyse the remaining events to find the next smallest period, and so on.
The method ceases to be a practical one as soon as the point frequency Q becomes very 
small, and therefore it is most unlikely to be useful in the neurological applications where 
the number of sources may be one hundred or more. For this we need less direct methods 
which we now consider very briefly. 
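
One step of the direct method described above (find the smallest period from the longest interval, build up that source, delete it) might be sketched as follows. This is a hedged illustration only: the function name `peel_smallest_period` and the tolerance `tol` are assumptions, event times are taken to be recorded essentially without error, and the record is assumed long enough for at least one interval of full length $\theta_N$ to occur.

```python
def peel_smallest_period(events, tol=1e-6):
    """Split a sorted pooled record into the source of smallest period and the remaining events."""
    gaps = [(later - earlier, earlier) for earlier, later in zip(events, events[1:])]
    period, start = max(gaps)  # longest interval, and the event at which it begins
    on_source, rest = [], []
    for t in events:
        steps = (t - start) / period
        offset = abs((steps + 0.5) % 1.0 - 0.5)  # distance from steps to the nearest integer
        (on_source if offset < tol else rest).append(t)
    return period, on_source, rest


if __name__ == "__main__":
    # Pooled record from two sources with periods 1 and sqrt(2) (illustrative).
    record = sorted([float(k) for k in range(1, 51)] + [k * 2 ** 0.5 for k in range(1, 36)])
    period, first_source, remainder = peel_smallest_period(record)
    print(period, len(first_source), len(remainder))
```

Applying the same function to `remainder` would give the next smallest period, and so on; with many sources the longest interval is rarely a full period, which is the practical limitation noted in the text.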







































































[Figure 3: three panels, Example A, Example B and Example C, plotting frequency against interval, with the exact point frequency marked in each panel.]

Fig. 3. Frequency distribution of interval between successive events. Exact expression; exponential curve; second approximation.

[Figure 4: three panels, Example A, Example B and Example C, plotting $V(t)$ against time.]

Fig. 4. Variance-time curves. Pooled periodic sources; random series with same mean interval.


Any method based on the frequency distribution of intervals is clearly insensitive because
the frequency curve is very nearly exponential except when $N$ is small. A much better
procedure is to use the variance-time curve. To determine this from a set of experimental
data, the following method may be used. First divide the series into intervals of length $\tau$
such that only a small number of events occur in each interval. Next count the number of
events occurring in each interval, so deriving a discrete time series $[x_1, \ldots, x_m]$, where $x_i$ is
the number of events in $(i - 1)\tau < t < i\tau$. Add the $x$’s together in blocks of $r$, giving

$$\left.\begin{aligned}
U_1^{(r)} &= x_1 + \ldots + x_r,\\
U_2^{(r)} &= x_2 + \ldots + x_{r+1},\\
&\ \ \vdots\\
U_{m-r+1}^{(r)} &= x_{m-r+1} + \ldots + x_m.
\end{aligned}\right\} \tag{28}$$

The $U$’s are the numbers of events occurring in intervals of length $r\tau$ so that an estimate
of $V(r\tau)$ may be formed from the corrected sum of squares of the $U^{(r)}$’s. Put $M = m - r + 1$
and take as the estimate

$$\hat{V}(r\tau) = \Bigl[\frac{3M}{3M^2 - 3Mr + r^2 - 1}\Bigr] \times \text{corrected sum of squares of } U^{(r)}\text{’s}. \tag{29}$$

Do this for a number of values of $r$ and plot $\hat{V}(r\tau)$ against $r\tau$. The divisor in (29) has been
chosen so that if the series is random (i.e. if the $x_i$ are independent Poisson variables with
constant mean $\tau\lambda$), then

$$E[\hat{V}(r\tau)] = V(r\tau) = r\tau\lambda, \tag{30}$$

provided that $r < \tfrac{1}{2}m$.
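
The estimate can be sketched in Python as below, reading the multiplier in (29) as $3M/(3M^2 - 3Mr + r^2 - 1)$; that reading is the one which makes the estimate unbiased for a random series, and for $r = 1$ it reduces to the ordinary divisor $1/(M - 1)$. The function name is hypothetical and the counts in the usage lines are made up for illustration.

```python
def variance_time_estimate(counts, r):
    """Estimate V(r * tau) from counts x_1..x_m using overlapping block sums of length r."""
    m = len(counts)
    big_m = m - r + 1  # number of overlapping blocks
    if r < 1 or big_m < 2 or r > big_m:
        raise ValueError("need 1 <= r <= m - r + 1 and at least two blocks")
    blocks = [sum(counts[a:a + r]) for a in range(big_m)]
    mean = sum(blocks) / big_m
    corrected_ss = sum((u - mean) ** 2 for u in blocks)
    return 3.0 * big_m / (3.0 * big_m ** 2 - 3.0 * big_m * r + r ** 2 - 1.0) * corrected_ss


if __name__ == "__main__":
    counts = [1, 0, 2, 1, 0, 1, 3, 0, 1, 1, 2, 0]  # hypothetical counts per interval tau
    for r in (1, 2, 3, 4):
        print(r, variance_time_estimate(counts, r))
```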

This analysis is closely connected with the serial correlation analysis of the $x_i$ (Yule, 1945).
The variance of $\hat{V}(r\tau)$ can be shown by tedious but elementary methods to be, for a random
series,


hie (4r?7A + 3r + 27A) — 2r87A + 16r?7A — 24r7A + 167A — 7? +1) + (srs) 


ee 
3M? M3)° 


It was shown in §4 that if the series is the pooled output of $N$ periodic sources, $V(t)$
oscillates about $\tfrac{1}{6}N$ for large $t$. This result enables us both to test for the existence of periods
and to estimate $N$. It will be shown in another paper (Cox & Smith, 1953) that if the
intervals between the events on any one source have coefficient of variation $C$, then

$$V(t) \sim C^2 \lambda t \quad \text{for large } t, \tag{31}$$

where $\lambda^{-1}$ is the mean interval between events on the pooled output. Both $C$ and the number
of sources can be estimated from the $V(t)$ curve.
Thus our method of using the variance-time curve distinguishes between: 


(a) a random series; 
(b) the pooled output of periodic sources;

(c) the pooled output of sources on each of which the coefficient of variation of intervals
is $C \neq 1$.


The method has been applied to some neurological data. Full details will be given 
elsewhere. 







SUMMARY 


Suppose that there are $N$ sources, that on the $i$th source events occur at times $\theta_i, 2\theta_i, \ldots$, and
that the outputs of the N sources are combined into a single pooled output. Statistical 
properties of the pooled output are investigated. Methods are suggested for distinguishing 
it from a random series and for estimating N from experimental data. Applications are 
indicated to experimental psychology and to neuro-physiology. 


We are grateful to Dr J. Wishart for helpful comments on the draft of the paper. 


REFERENCES 


Conrad, R. (1951). Brit. J. Industr. Med. 8, 1.

Cox, D. R. & Smith, W. L. (1953). To be published.

Fatt, P. & Katz, B. (1952). J. Physiol. 117, 109.

Householder, A. S. & Landahl, H. D. (1945). Mathematical Biophysics
of the Central Nervous System. Bloomington: Principia Press.

Weyl, H. (1916). Math. Ann. 77, 313.

Yule, G. U. (1945). J. R. Statist. Soc. 108, 208.













APPROXIMATE CONFIDENCE INTERVALS 


By M. S. BARTLETT
University of Manchester 


1. INTRODUCTION 


It is fairly generally known that asymptotic confidence intervals for an unknown parameter
$\theta$ may be obtained directly from the likelihood (log) derivative $\partial L/\partial\theta$, where $p = \exp L$ is the
probability of the sample (see, for example, M. G. Kendall, 1946, §19.10). Such intervals
are asymptotically equivalent to those obtained from the maximum-likelihood estimate
$\hat\theta$, and have the property of providing asymptotically shortest confidence intervals on the
average (see Kendall, § 19.12). The direct use of $\partial L/\partial\theta$ has the advantage that its mean and
variance are (under the usual differentiation conditions, which will be assumed throughout
up to any order required) known exactly as 0 and

$$I = E\Bigl\{\Bigl(\frac{\partial L}{\partial\theta}\Bigr)^2\Bigr\} = E\Bigl\{-\frac{\partial^2 L}{\partial\theta^2}\Bigr\},$$

in contrast with the asymptotic properties of $\hat\theta$, so that the only approximation introduced
as far as the sampling distribution of $\partial L/\partial\theta$ is concerned is in regarding it as normal. This
suggests that its use to provide confidence intervals should be approximately valid under
fairly wide conditions, and, in so far as we can study the precise distribution of $\partial L/\partial\theta$, may
even become exact. The further requirement for a valid confidence interval obtained from
any random quantity $T(\theta)$ is that $T(\theta)$ bears a monotonic relation with $\theta$ for all samples,
yielding a unique and admissible value $\theta_0$ for each critical value $T_0$.

Now with regard to the normality approximation it is quite feasible to investigate also
the higher moments of $\partial L/\partial\theta$, so that a correction to allow, say, for its skewness is not unduly
complicated. It is the purpose of this note to develop such a further approximation
procedure. It would appear difficult to specify precise and useful conditions under which such
an approximation, leading to the use of a quantity $T(\theta)$, say, is necessarily free from possible
violation of the further requirements on the relation of $T(\theta)$ with $\theta$, but one or two incidental
comments on this point may be helpful. We may expect any correcting terms to have
a relatively minor effect except perhaps for very small samples, and thus not to disturb any
monotonic relation with $\theta$ in the neighbourhood of the critical values of $\theta$. With regard to
the intervals which would be obtained from $\partial L/\partial\theta$ if its exact distribution were made use of,
it is noted that if a sufficient statistic for $\theta$ exists the procedure is equivalent to making use
of it,* but this does not necessarily imply that an exact confidence interval is possible even
in such cases. For example, in small samples a confidence limit for $\theta$, even when the required
monotonic relation with $\theta$ is present, may yield a value outside the admissible range (this
can happen also with the maximum likelihood estimate). In such cases the fact that we
cannot always make full use of the sampling distribution means that our admissible
confidence limits over all samples should have a confidence coefficient not less than the level


* This naturally raises the question of how far $\partial L/\partial\theta$ may be regarded in general as a ‘theoretical
sufficient statistic’ for $\theta$ (cf. Bartlett, 1952, §6). However, this question is not altogether an
unambiguous one, and I think that the answer is strictly only ‘yes’ if $\partial L/\partial\theta$, as a random variable, is
considered simultaneously for all possible $\theta$, and is ‘no’ if it is merely considered for the single true
value of $\theta$.










claimed. It might also be remarked that the approximation procedure is sometimes
convenient even if a sufficient statistic does exist, in cases where the exact distribution of
the sufficient statistic is difficult to handle (e.g. if the observations are from truncated
distributions).

Only the case of one unknown parameter $\theta$ will be considered here, but it is hoped to
discuss the case of more than one unknown in a later paper. The extension to confidence
regions is in principle straightforward, but the case of one unknown plus ‘nuisance
parameters’ is more subtle, being much more dependent on the nature of the sufficiency
properties of $\partial L/\partial\theta$ (see earlier footnote; cf. also Bartlett, 1936, §6).


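2. MOMENTS OF $\partial L/\partial\theta$


For reference a convenient method will be given of expressing the moments of $\partial L/\partial\theta$ in
their simplest terms for evaluation, as an extension of the familiar result

$$E\Bigl\{\Bigl(\frac{\partial L}{\partial\theta}\Bigr)^2\Bigr\} = E\Bigl\{-\frac{\partial^2 L}{\partial\theta^2}\Bigr\}.$$

It is desirable to introduce a more convenient notation, and we shall write

$$E\Bigl\{\frac{\partial L}{\partial\theta}\Bigr\} = L_1, \qquad E\Bigl\{\frac{\partial^2 L}{\partial\theta^2}\Bigr\} = L_2, \qquad
E\Bigl\{\Bigl(\frac{\partial L}{\partial\theta}\Bigr)^2\Bigr\} = L_1^{(2)}, \qquad E\Bigl\{\frac{\partial L}{\partial\theta}\frac{\partial^2 L}{\partial\theta^2}\Bigr\} = (L_1 L_2),$$

etc. Now

$$1 = \int p(\theta + \tau) = \int p(\theta) \exp\Bigl\{\tau \frac{\partial L}{\partial\theta} + \frac{\tau^2}{2!} \frac{\partial^2 L}{\partial\theta^2} + \ldots\Bigr\},$$

where $\int p(\theta)$ is a formal notation for the probability Stieltjes integral over all possible
samples. Expanding the last expression in powers of $\tau$, and equating coefficients, we obtain
an infinite series of relations of which the first four are

$$\left.\begin{aligned}
L_1 &= 0,\\
L_2 + L_1^{(2)} &= 0,\\
L_3 + 3(L_1 L_2) + L_1^{(3)} &= 0,\\
L_4 + 4(L_1 L_3) + 3L_2^{(2)} + 6(L_1^2 L_2) + L_1^{(4)} &= 0.
\end{aligned}\right\} \tag{1}$$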


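The first two are the familiar ones already quoted, and the next two involve the third and
fourth moments $L_1^{(3)}$ and $L_1^{(4)}$. These are not yet in their simplest terms; thus it is possible
to express $L_1^{(3)}$ in ‘linear’ terms alone. We may, however, obtain an unlimited number of
further relations by differentiating the relations (1). We have by differentiating the second
and third relations once, noting that

$$\frac{\partial L_1^{(2)}}{\partial\theta} = \frac{\partial}{\partial\theta} \int p(\theta) \Bigl(\frac{\partial L}{\partial\theta}\Bigr)^2 = L_1^{(3)} + 2(L_1 L_2),$$

etc.,

$$\frac{\partial L_2}{\partial\theta} + L_1^{(3)} + 2(L_1 L_2) = 0,$$

$$\frac{\partial L_3}{\partial\theta} + 3\{(L_1^2 L_2) + L_2^{(2)} + (L_1 L_3)\} + \{L_1^{(4)} + 3(L_1^2 L_2)\} = 0.$$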












Differentiating the former of these two derived relations again, we obtain further 
aly + {Lf + 3(L{2Lq)} + 2{(L{?L,) + LY + (L,Ly)} = 0. 

We find finally, reverting to the full notation for clarity, 


“lea) =#( (eo) J= ®20* 22a) a 
“le0) =*{(@0) |-*" 
= oth 3 32 ap sol) a) 


where it will be noticed that $\kappa_4$ involves one 'non-linear' quantity, the last variance term
involving the mean square of $\partial^{2}L/\partial\theta^{2}$. Whether these formulae (2) and (3) are useful will of
course depend on the form of $L$; for example, in the case of the Cauchy distribution it is
simpler to evaluate $\kappa_3$ and $\kappa_4$ directly.
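As an illustration only, the two formulae may be checked numerically on a case where every expectation is elementary. The sketch below assumes a single Poisson observation with mean θ (the distribution taken up later in Example II), for which I = 1/θ, E(∂³L/∂θ³) = 2/θ², E(∂⁴L/∂θ⁴) = −6/θ³ and σ²(∂²L/∂θ²) = 1/θ³. The function names are hypothetical, and the cumulants obtained from (2) and (3) are compared with those found by summing the moments of the score directly.

```python
import math

def poisson_expectation(g, theta, x_max=400):
    """E[g(x)] for x ~ Poisson(theta), by direct summation of the pmf."""
    p = math.exp(-theta)              # P(x = 0)
    total = 0.0
    for x in range(x_max + 1):
        total += g(x) * p
        p *= theta / (x + 1)          # P(x + 1) obtained from P(x)
    return total

def score_cumulants_direct(theta):
    """kappa_3 and kappa_4 of the score x/theta - 1, from its moments."""
    m2 = poisson_expectation(lambda x: (x / theta - 1) ** 2, theta)
    m3 = poisson_expectation(lambda x: (x / theta - 1) ** 3, theta)
    m4 = poisson_expectation(lambda x: (x / theta - 1) ** 4, theta)
    return m3, m4 - 3 * m2 ** 2

def score_cumulants_from_formulae(theta):
    """kappa_3 and kappa_4 from formulae (2) and (3) for the Poisson case."""
    dI = -1 / theta ** 2              # derivative of I = 1/theta
    d2I = 2 / theta ** 3
    e_l3 = 2 / theta ** 2             # E(d3L/dtheta3)
    de_l3 = -4 / theta ** 3           # its derivative with respect to theta
    e_l4 = -6 / theta ** 3            # E(d4L/dtheta4)
    var_l2 = 1 / theta ** 3           # variance of d2L/dtheta2 = -x/theta^2
    k3 = 3 * dI + 2 * e_l3
    k4 = 6 * d2I + 8 * de_l3 - 3 * e_l4 + 3 * var_l2
    return k3, k4

print(score_cumulants_direct(4.0))
print(score_cumulants_from_formulae(4.0))
```

Both routes give κ₃ = 1/θ² and κ₄ = 1/θ³ for this assumed case, the 'non-linear' variance term being needed only for κ₄.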


3. APPROXIMATE CORRECTION FOR THE SKEWNESS OF $\partial L/\partial\theta$


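It seems most useful to consider polynomial transformations of $\partial L/\partial\theta$, as in principle such
transformations may be chosen with the aid of the above relations to annihilate the skewness
(and, if necessary, any higher non-zero cumulants) exactly, although it is perhaps unlikely
in practice that we shall wish to go further than a first further approximation involving $\kappa_3$.
Consider the theoretical statistic

$$T = \frac{\partial L}{\partial\theta} + \lambda\left[\left(\frac{\partial L}{\partial\theta}\right)^{2} - I\right], \tag{4}$$

which still has zero mean, and variance

$$I + 2\lambda\kappa_3 + \lambda^{2}\sigma^{2}\left\{\left(\frac{\partial L}{\partial\theta}\right)^{2}\right\}. \tag{5}$$

Its skewness is

$$\kappa_3 + 3\lambda\sigma^{2}\left\{\left(\frac{\partial L}{\partial\theta}\right)^{2}\right\} + O(\lambda^{2}). \tag{6}$$

As $\sigma^{2}\{(\partial L/\partial\theta)^{2}\} = \kappa_4 + 2I^{2}$, we therefore choose, to the first order of approximation,

$$\lambda = -\frac{1}{6}\frac{\kappa_3}{I^{2}}, \tag{7}$$

whence

$$T(\theta) = \frac{\partial L}{\partial\theta} - \frac{1}{6}\frac{\kappa_3}{I^{2}}\left[\left(\frac{\partial L}{\partial\theta}\right)^{2} - I\right]. \tag{8}$$

The variance of $T$, if we neglect $\kappa_3^{2}$, is still $I$. More accurate confidence intervals should
therefore be obtained if we write

$$T(\theta) = \pm\mu\sqrt{I(\theta)}, \tag{9}$$

where $\mu$ is the appropriate normal deviate for any required significance level, and solve for $\theta$.
Of course if $\kappa_3 = 0$, the standard approximation remains unaltered to this order. In this
particular case, i.e. when $\kappa_3 = 0$, the correction for the next cumulant $\kappa_4$ is also recorded.

We write

$$T_1(\theta) = \frac{\partial L}{\partial\theta} + \nu\left(\frac{\partial L}{\partial\theta}\right)^{3} \tag{10}$$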










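with mean zero, variance

$$\sigma^{2} = I + 2\nu[\kappa_4 + 3I^{2}] + O(\nu^{2}), \tag{11}$$

skewness

$$E\{T_1^{3}\} = 3\nu E\left\{\left(\frac{\partial L}{\partial\theta}\right)^{5}\right\} + O(\nu^{2}) = 0 + O(\nu^{2}, \nu\kappa_5), \tag{12}$$

and kurtosis

$$E\{T_1^{4}\} - 3\sigma^{4} = \kappa_4 + 3I^{2} + 4\nu E\left\{\left(\frac{\partial L}{\partial\theta}\right)^{6}\right\} - 3\sigma^{4} + O(\nu^{2}),$$

or, after reduction,

$$\kappa_4 + 24\nu I^{3} + O(\nu^{2}, \nu\kappa_4, \nu\kappa_6). \tag{13}$$

Hence we choose

$$\nu = -\frac{1}{24}\frac{\kappa_4}{I^{3}}, \tag{14}$$

$$T_1(\theta) = \frac{\partial L}{\partial\theta} - \frac{1}{24}\frac{\kappa_4}{I^{3}}\left(\frac{\partial L}{\partial\theta}\right)^{3}, \tag{15}$$

with variance

$$\sigma^{2} \sim I(1 - \tfrac{1}{4}\kappa_4/I^{2}). \tag{16}$$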
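The effect of the choice (7) can be seen in a small numerical sketch. It again assumes a single Poisson observation with mean θ purely as a convenient test case (so that ∂L/∂θ = x/θ − 1, I = 1/θ and κ₃ = 1/θ²); the function names are hypothetical. The third moment of the transformed statistic (8), which has zero mean, is computed by exact summation and compared with that of the raw score.

```python
import math

def poisson_expectation(g, theta, x_max=400):
    """E[g(x)] for x ~ Poisson(theta), by direct summation of the pmf."""
    p = math.exp(-theta)
    total = 0.0
    for x in range(x_max + 1):
        total += g(x) * p
        p *= theta / (x + 1)
    return total

def third_moments(theta):
    """Third moments of the raw score and of T(theta) as defined in (8)."""
    info = 1 / theta                       # I
    k3 = 1 / theta ** 2                    # kappa_3 of the score
    lam = -k3 / (6 * info ** 2)            # equation (7)

    def score(x):
        return x / theta - 1

    def corrected(x):
        return score(x) + lam * (score(x) ** 2 - info)

    raw = poisson_expectation(lambda x: score(x) ** 3, theta)
    new = poisson_expectation(lambda x: corrected(x) ** 3, theta)
    return raw, new

raw, new = third_moments(10.0)
print(raw, new)
```

In this assumed case the third moment is reduced by well over an order of magnitude, the residue being of the order neglected in (6) and (7).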


4. EXAMPLE I


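To test this procedure, we shall try it on a standard problem whose exact solution is
well known, namely, confidence limits for a (normal) sample variance. This example is
taken because the sample variance $s^{2}$ is quite skew for a moderate number of degrees of
freedom, and there is some interest in seeing how far the method can cope with this skewness.
We have

$$\frac{\partial L}{\partial\theta} = \frac{n}{2\theta^{2}}(s^{2} - \theta),$$

$$\frac{\partial^{2}L}{\partial\theta^{2}} = \frac{n}{2\theta^{2}} - \frac{ns^{2}}{\theta^{3}}, \qquad I = \frac{n}{2\theta^{2}},$$

and

$$\frac{\partial^{3}L}{\partial\theta^{3}} = -\frac{n}{\theta^{3}} + \frac{3ns^{2}}{\theta^{4}}, \qquad E\left(\frac{\partial^{3}L}{\partial\theta^{3}}\right) = \frac{2n}{\theta^{3}},$$

$$\frac{\partial I}{\partial\theta} = -\frac{n}{\theta^{3}}, \qquad \kappa_3 = \frac{n}{\theta^{3}},$$

the value of $\kappa_3$ checking of course with the value inferred from the properties of $s^{2}$. The
standard approximation for the confidence interval would amount to solving

$$\frac{n}{2\theta^{2}}(1 - \theta) = \pm\mu\sqrt{\left(\frac{n}{2\theta^{2}}\right)}, \tag{17}$$

where for convenience we have expressed confidence limits $\theta$ in terms of $s^{2}$, i.e. have put
$s^{2} = 1$. The more accurate approximation suggested replaces (17) by the equation

$$\frac{n}{2\theta^{2}}(1 - \theta) - \frac{n}{6\theta^{3}}\left[(1 - \theta)^{2} - \frac{2\theta^{2}}{n}\right] = \pm\mu\sqrt{\left(\frac{n}{2\theta^{2}}\right)}. \tag{18}$$

This as it stands no longer yields a simple quadratic equation for $\theta$, but to the same order
of approximation we may replace the square bracket by its value from (17), viz. $2\theta^{2}(\mu^{2} - 1)/n$,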













and (18) then also reduces to a quadratic (i.e. to a linear equation for each sign of $\mu$). A corresponding
simplification is available in general, for the crude interval is obtained from

$$\frac{\partial L}{\partial\theta} = \pm\mu\sqrt{I}, \tag{19}$$

and substituting in $T(\theta)$ in (9), we obtain

$$\frac{\partial L}{\partial\theta} - \frac{1}{6}\frac{\kappa_3}{I}(\mu^{2} - 1) = \pm\mu\sqrt{I} \tag{20}$$

as the corresponding modification of the general equation (9).
In arithmetical work where $\partial L/\partial\theta$ is calculated for various values of $\theta$, either (20) or (9)
may be used. The original approximation (19) is most conveniently plotted as

$$\frac{\partial L}{\partial\theta}\Big/\sqrt{I} = \pm\mu, \tag{21}$$

for the values of $\theta$ for which the left-hand side passes any chosen value of $\mu$ can be read off
from the same graph. Equation (9) similarly is written

$$\frac{\partial L}{\partial\theta}\Big/\sqrt{I} - \frac{1}{6}\frac{\kappa_3}{I^{3/2}}\left[\left(\frac{\partial L}{\partial\theta}\Big/\sqrt{I}\right)^{2} - 1\right] = \pm\mu \tag{22}$$

and may be used in the same way. The equation equivalent to (20), viz.

$$\frac{\partial L}{\partial\theta}\Big/\sqrt{I} - \frac{1}{6}\frac{\kappa_3}{I^{3/2}}(\mu^{2} - 1) = \pm\mu, \tag{23}$$

may only be used at the one chosen value of $\mu$, but has the slight advantage that if interpolation
for $\theta$ (or some suitable function $g(\theta)$) is approximately linear in (21), it is likely to
be so also for (23), rather than for (22).











Table 1

        Exact                            First or standard approximation   Second or further approximation
  n     Lower   Lower   Upper   Upper    Lower   Lower   Upper   Upper     Lower   Lower   Upper   Upper
        0.01    0.05    0.05    0.01     0.01    0.05    0.05    0.01      0.01    0.05    0.05    0.01

   5    0.331   0.452   4.37    9.03     0.405   0.490   ∞*      ∞*        0.327   0.441   5.35    8.55
  10    0.431   0.546   2.54    3.91     0.490   0.576   3.78    ∞*        0.428   0.541   2.65    3.94
  20    0.532   0.637   1.84    2.42     0.576   0.658   2.08    3.78      0.531   0.634   1.86    2.43
  30    0.589   0.685   1.62    2.01     0.625   0.702   1.74    2.50      0.589   0.684   1.63    2.01















































* In this example the approximations break down at the upper limit for small n and P; this occurs 
with the first approximation even for some of the values in the table. 


To return to (17) and (18), the resulting limits are compared in Table 1 with the correct
limits obtained from the distribution of $s^{2}$, for upper and lower significance limits P = 0.05
and 0.01, or total significance limits of 0.10 and 0.02, and representative values 5, 10, 20,
and 30 for n.
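The two approximate columns of Table 1 can be recomputed in a few lines. The following is a sketch under the conventions of the text (s² = 1, with the square bracket of (18) replaced by its value from (17)), so that both equations are linear in θ for each sign of μ: the first approximation gives θ = 1/{1 ± μ√(2/n)} and the second θ = 1/{1 ± μ√(2/n) + 2(μ² − 1)/(3n)}. The function name is hypothetical, and a non-positive denominator is reported as an infinite limit, corresponding to the entries marked ∞* in the table.

```python
import math
from statistics import NormalDist

def variance_limits(n, P):
    """Approximate (lower, upper) limits for theta with s^2 = 1.

    Returns the pair from the first approximation (17) and the pair from
    the second approximation (18) in its linearized form.
    """
    mu = NormalDist().inv_cdf(1 - P)       # one-sided normal deviate
    a = mu * math.sqrt(2 / n)
    b = 2 * (mu ** 2 - 1) / (3 * n)        # skewness correction term

    def limit(denominator):
        # A non-positive denominator means the approximation breaks down.
        return 1 / denominator if denominator > 0 else math.inf

    first = (limit(1 + a), limit(1 - a))
    second = (limit(1 + a + b), limit(1 - a + b))
    return first, second

for n in (5, 10, 20, 30):
    print(n, variance_limits(n, 0.05), variance_limits(n, 0.01))
```

Small differences in the last figure from the tabulated values are to be expected, since the table was presumably computed with rounded normal deviates.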


† This further approximation has been used when obtaining the values given in Tables 1 and 2.










It will be seen that even the second approximation is not too good at the upper limit for
n = 5, and would be inferior to at least one other available approximation for the significance
limits for $s^{2}$, but the important point to remember is that it is a much more general method,
and this example suggests that it should be a considerable improvement over the first
approximation, which is still hardly satisfactory in this example for n as large as 30.


5. EXAMPLE II


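As a second example we shall examine a standard discrete distribution, choosing for
simplicity the Poisson rather than the binomial. Here we easily find, for $\theta$ the unknown
mean and $x$ an observation,

$$\frac{\partial L}{\partial\theta} = \frac{x}{\theta} - 1, \qquad I = \frac{1}{\theta},$$

$$E\left(\frac{\partial^{3}L}{\partial\theta^{3}}\right) = \frac{2}{\theta^{2}}, \qquad \kappa_3 = \frac{1}{\theta^{2}},$$

whence the crude approximation gives the equation

$$\frac{x}{\theta} - 1 = \pm\frac{\mu}{\sqrt{\theta}}, \tag{24}$$

and the further approximation (corresponding to (20))

$$\frac{x}{\theta} - 1 - \frac{1}{6\theta}(\mu^{2} - 1) = \pm\frac{\mu}{\sqrt{\theta}}. \tag{25}$$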


Of course for discrete distributions a confidence interval (at least if obtained from the data 
directly without introduction of extra randomization devices) can only be given as an 
inequality for the confidence level. This is perhaps hardly crucial if an approximate method 








Table 2

        Exact*                           First or standard approximation   Second or further approximation
  x     Lower   Lower   Upper   Upper    Lower   Lower   Upper   Upper     Lower   Lower   Upper   Upper
        0.01    0.05    0.05    0.01     0.01    0.05    0.05    0.01      0.01    0.05    0.05    0.01

   0     0       0       3.00    4.61     †       †       3.64    6.37      †       †       3.12    4.93
   1     0.01    0.05    4.74    6.64     0.04    0.07    5.28    8.14      †       0.02    4.83    6.86
   2     0.15    0.36    6.30    8.41     0.28    0.43    6.79    9.77      0.09    0.31    6.37    8.58
   3     0.44    0.82    7.75   10.05     0.64    0.92    8.22   11.33      0.36    0.77    7.81   10.19

   5     1.28    1.97   10.51   13.11     1.58    2.11   10.94   14.30      1.21    1.93   10.56   13.22
  10     4.13    5.43   16.96   20.14     4.54    5.61   17.35   21.21      4.07    5.40   17.00   20.23
  20    11.08   13.25   29.06   33.10    11.58   13.46   29.42   34.08     11.04   13.23   29.09   33.16
  30    18.74   21.59   40.69   45.40    19.29   21.82   41.04   46.33     18.71   21.58   40.71   45.45


















































* Quoted from Garwood (1936). 

† The first approximation gives θ = 0 here only if the continuity correction is not used. The second
approximation no longer gives near θ = 0 a monotonic relation between T(θ) and θ for small x, and
thus breaks down near θ = 0. (The inadequacy of the second approximation for very small θ is hardly
unexpected, for the Poisson mean θ in this example effectively takes the place of n, the size of the
sample.)




is in any case being used, but it reminds us that a still further approximation is now involved
in obtaining limits from normal theory. For a purely discrete distribution as in this example
some gain would be expected from the introduction of the usual continuity correction, which
implies using $x - \tfrac{1}{2}$ when obtaining the lower limit for $\theta$ and $x + \tfrac{1}{2}$ for the upper limit. This
correction has therefore been used throughout, though from an examination of one or two
values the consequent gain for the first crude approximation appeared much more doubtful
than for the second.

It will be seen from Table 2 that the second approximation is a considerable improvement
over the first (except near $\theta = 0$), and is in fact somewhat better for small $x$ at the upper
limit than we might have anticipated from the experience with the first example. The exact
values quoted were readily obtained by Garwood from $\chi^{2}$ significance limits, and, as in
Example I, it must be emphasized that the purpose of examining the present approximation
in these standard examples is merely to obtain an idea of its accuracy.
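The Poisson limits can be computed directly, since after multiplication by θ both (24) and (25) become quadratics in √θ. The sketch below is a minimal implementation under the conventions stated in the text: x − ½ is used for the lower limit and x + ½ for the upper, and the correction term is (μ² − 1)/6. The function name and the use of `None` to mark a breakdown (no admissible root) are choices made here for illustration.

```python
import math
from statistics import NormalDist

def poisson_limits(x, P, second=True):
    """Approximate (lower, upper) limits for a Poisson mean.

    second=False solves the crude equation (24); second=True solves the
    further approximation (25). The continuity correction is applied.
    """
    mu = NormalDist().inv_cdf(1 - P)
    k = (mu ** 2 - 1) / 6 if second else 0.0

    def solve(c, sign):
        # Solve theta + sign*mu*sqrt(theta) - c = 0 for sqrt(theta) >= 0.
        disc = mu ** 2 + 4 * c
        if disc < 0:
            return None
        root = (-sign * mu + math.sqrt(disc)) / 2
        return root * root if root >= 0 else None

    lower = solve(x - 0.5 - k, +1)     # score equals +mu*sqrt(I)
    upper = solve(x + 0.5 - k, -1)     # score equals -mu*sqrt(I)
    return lower, upper

for x in (0, 1, 2, 3, 5, 10, 20, 30):
    print(x, poisson_limits(x, 0.05, second=False), poisson_limits(x, 0.05))
```

For x = 5 and P = 0.05 this gives limits close to the tabulated 2.11 and 10.94 (first approximation) and 1.93 and 10.56 (second approximation); the entries marked † in Table 2 correspond to `None`.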


6. EXAMPLE III 


The problem which actually initiated this inquiry is being discussed in detail elsewhere, but
it may be helpful to indicate it here. From the tracks of certain cosmic ray particles which
'decay' into other fundamental particles, it was required to estimate the 'decay' parameter
$\theta$ in the lifetime distribution

$$f(t)\,dt = e^{-t/\theta}\,dt/\theta. \tag{26}$$

This is simple enough in the case of unlimited track length, but for tracks in a chamber of
finite size, not all particles will decay; the chance of doing so moreover varies with each
particle because its time of passage through the chamber depends on its momentum. The
situation is not always the same as this and is often complicated by further observational
difficulties, but for simplicity here we shall suppose we have just $N$ particles of one type, with
effective times $T_s$ $(s = 1, \dots, N)$ in the chamber, of which $n$ decay at times $t_r$ $(r = 1, \dots, n)$. The
likelihood function in this case is a mixture of finite probability factors and densities, but
this causes no difficulty. For such data we have, as the probability of the $s$th undecayed particle
is $Q_s = e^{-T_s/\theta}$, and the probability density of the $r$th decayed particle is $f(t_r) = e^{-t_r/\theta}/\theta$,

$$\frac{\partial L}{\partial\theta} = -\frac{n}{\theta} + \frac{1}{\theta^{2}}\left(\sum_{r=1}^{n} t_r + \sum_{s=n+1}^{N} T_s\right), \tag{27}$$

giving in this particular case the immediate estimate

$$\hat\theta = \frac{1}{n}\left(\sum_{r=1}^{n} t_r + \sum_{s=n+1}^{N} T_s\right). \tag{28}$$

(The record of undecayed particles is not always available; in that case the estimation
equation is different and not quite such a simple solution $\hat\theta$ is possible from the combined
data.) It is not difficult to show that for equation (27)

$$I = \frac{1}{\theta^{2}}\sum_{s=1}^{N} P_s, \tag{29}$$

where $P_s = 1 - Q_s$. Thus the first approximation for the confidence interval would be
obtained from equation (19), with $\partial L/\partial\theta$ given now by (27) and $I$ by (29). In the limiting
case when all $T_s$ become large, the problem reduces to Example I, for each measurement
$t$ from the exponential distribution (26) is equivalent to a $\chi^{2}$ with two degrees of freedom,
and the mean $\bar t$ would be equivalent to a $\chi^{2}$ (or $s^{2}$) with $2N$ degrees of freedom. In practice






















the $T_s$ are often small, but in any case, in any attempt to obtain a confidence interval for
$\theta$, it would seem advisable when possible to go to the further approximation (9) or (20).
It is not too difficult, at least in the case covered by equation (27), to obtain the exact
moment-generating function $M(\phi)$ of $\theta^{2}\,\partial L/\partial\theta$, remembering that the chance of decay somewhere
in the chamber for each particle is $P_s$ $(s = 1, \dots, N)$. The answer comes out to be

$$M(\phi) = \prod_{s=1}^{N} M_s(\phi), \quad\text{with}\quad
M_s(\phi) = e^{-\theta\phi}\,[1 - Q_s e^{\phi T_s}\{1 - (1 - \theta\phi)e^{\theta\phi}\}]/(1 - \theta\phi). \tag{30}$$

Expanding $\log M(\phi)$, we find for the cumulants of $\theta^{2}\,\partial L/\partial\theta$

$$\left.\begin{aligned}
\kappa_2 &= \theta^{2}\sum_{s=1}^{N} P_s,\\
\kappa_3 &= 2\theta^{3}\sum_{s=1}^{N} P_s - 3\theta^{2}\sum_{s=1}^{N} Q_sT_s,\\
\kappa_4 &= 6N\theta^{4} - 8\theta^{3}\sum_{s=1}^{N} Q_sT_s - 6\theta^{2}\sum_{s=1}^{N} T_s^{2}Q_s - 3\theta^{4}\sum_{s=1}^{N} Q_s - 3\theta^{4}\sum_{s=1}^{N} Q_s^{2}.
\end{aligned}\right\} \tag{31}$$

However, as the distribution of $\partial L/\partial\theta$ is a mixture of continuous and discrete components
(for there is a finite chance that none of the $N$ particles decays), it would be difficult to make
use of the exact distribution of $\partial L/\partial\theta$ in this particular case, and the use of the cumulants
(31), while first obtained directly by the above method, would be equivalent to using the
general method developed earlier in this paper. The formulae in (31) agree of course with
the results derived via the general method, and the expressions for $\kappa_3$ and $\kappa_4$ were checked
by this means.
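A direct numerical check of the cumulants is easy for a single particle (N = 1), for which the contribution to θ² ∂L/∂θ is t − θ if the particle decays at time t < T and T if it does not. The sketch below, with hypothetical function names, obtains the moments of this variable by Simpson's rule and compares the resulting cumulants with the single-particle forms of (31), namely κ₂ = θ²P, κ₃ = 2θ³P − 3θ²QT and κ₄ = 6θ⁴ − 8θ³QT − 6θ²T²Q − 3θ⁴Q − 3θ⁴Q².

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def particle_cumulants_numeric(theta, T):
    """kappa_2, kappa_3, kappa_4 of the single-particle contribution."""
    Q = math.exp(-T / theta)               # chance of no decay in the chamber

    def moment(k):
        def integrand(t):
            return (t - theta) ** k * math.exp(-t / theta) / theta
        return simpson(integrand, 0.0, T) + Q * T ** k

    m2, m3, m4 = moment(2), moment(3), moment(4)   # the mean is zero
    return m2, m3, m4 - 3 * m2 ** 2

def particle_cumulants_formula(theta, T):
    """The same cumulants from (31) with N = 1."""
    Q = math.exp(-T / theta)
    P = 1 - Q
    k2 = theta ** 2 * P
    k3 = 2 * theta ** 3 * P - 3 * theta ** 2 * Q * T
    k4 = (6 * theta ** 4 - 8 * theta ** 3 * Q * T - 6 * theta ** 2 * T ** 2 * Q
          - 3 * theta ** 4 * Q - 3 * theta ** 4 * Q ** 2)
    return k2, k3, k4

print(particle_cumulants_numeric(2.0, 3.0))
print(particle_cumulants_formula(2.0, 3.0))
```

Since cumulants of independent contributions add, agreement for one particle implies agreement of the sums in (31).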

For maximum-likelihood or confidence interval equations not directly soluble, iterative 
or interpolative methods may usually be used (see equations (21), (22) and (23), §4). 
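As a sketch of such an iterative solution for the present example, the following solves equation (23) by bisection, with ∂L/∂θ from (27), I from (29) and κ₃ of ∂L/∂θ taken as 2ΣP_s/θ³ − 3ΣQ_sT_s/θ⁴ (the second line of (31) divided by θ⁶). The data are invented purely for illustration, and the function names, the representation of each particle as a pair (T_s, decay time or `None`) and the bracketing intervals are assumptions of this sketch.

```python
import math
from statistics import NormalDist

def statistic_23(theta, particles, mu):
    """Left-hand side of equation (23) for the decay problem."""
    n = sum(1 for _, t in particles if t is not None)
    S = sum(t if t is not None else T for T, t in particles)
    score = -n / theta + S / theta ** 2                        # (27)
    sum_P = sum(-math.expm1(-T / theta) for T, _ in particles)
    sum_QT = sum(math.exp(-T / theta) * T for T, _ in particles)
    info = sum_P / theta ** 2                                  # (29)
    k3 = 2 * sum_P / theta ** 3 - 3 * sum_QT / theta ** 4
    return score / math.sqrt(info) - k3 * (mu ** 2 - 1) / (6 * info ** 1.5)

def bisect(f, lo, hi, iterations=200):
    """Root of f in [lo, hi], assuming f changes sign over the interval."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def decay_interval(particles, P=0.05):
    """Estimate (28) and approximate limits from (23) at one-sided level P."""
    mu = NormalDist().inv_cdf(1 - P)
    n = sum(1 for _, t in particles if t is not None)
    S = sum(t if t is not None else T for T, t in particles)
    theta_hat = S / n
    lower = bisect(lambda th: statistic_23(th, particles, mu) - mu,
                   theta_hat / 100, theta_hat)
    upper = bisect(lambda th: statistic_23(th, particles, mu) + mu,
                   theta_hat, theta_hat * 1000)
    return theta_hat, lower, upper

# Invented data: ten particles with T_s = 1, six of which decay.
particles = [(1.0, t) for t in (0.1, 0.2, 0.3, 0.5, 0.7, 0.9)] + [(1.0, None)] * 4
print(decay_interval(particles))
```

The same routine serves for the first approximation (21) if the κ₃ term is omitted.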


I am indebted to Mrs A. Linnert for assistance with the calculations for Tables 1 and 2. 


REFERENCES 


Bartlett, M. S. (1936). Statistical information and properties of sufficiency. Proc. Roy. Soc. A, 154,
124.

Bartlett, M. S. (1952). The statistical significance of odd bits of information. Biometrika, 39, 228.

Garwood, F. (1936). Fiducial limits for the Poisson distribution. Biometrika, 28, 437.

Kendall, M. G. (1946). The Advanced Theory of Statistics, 1st ed. 2. London: Griffin.


Notes added in proof. (i) The coefficients in the polynomial transformations considered 
in §3 are equivalent to those in a series expansion given by E. A. Cornish & R. A. 
Fisher in §8 of their paper ‘Moments and cumulants in the specification of distributions,’ 
Rev. Inst. Int. Statis. (1937), 4, which should be consulted if further terms of the 
expansion are required. 

(ii) The problem referred to in §6 I have discussed further in my paper ‘On the 
statistical estimation of mean life-time,’ Phil. Mag. (7th series), 1953, 44 (in the Press). 













INCOMPLETE AND ABSOLUTE MOMENTS OF THE MULTIVARIATE 
NORMAL DISTRIBUTION WITH SOME APPLICATIONS 


By A. R. KAMAT, 
University College, London 


1. INTRODUCTION 


It is a well-known property of normally distributed error variables $x_1, x_2, x_3, \dots$ that the
process of totalling or averaging the error produces a normal distribution. This property
also, of course, holds if the errors are correlated 'normally'. Occasions arise, however, when
it is necessary to consider the total numerical error $T = |x_1| + |x_2| + |x_3| + \dots$ (or its
average). Tricomi (1936) has investigated the statistic $T$ as a measure of dispersion when
the variables are uncorrelated and are of equal variance. Arley & Hald (1950) have proposed
the statistic $\sum_{i=1}^{n-1} |x_i - x_{i+1}|/(n-1)$ as a measure of dispersion.*


In certain technological applications also it is found that individual errors compound into
a total error obtained by the addition of numerical values of individual errors. We may
mention the following example: In the manufacture of metal rods it is found that the lengths
of individual rods differ from specification by a normal error variable. Pairs of rods with
lengths $x_1$ and $x_2$ are assembled in a manner requiring the longer of the pair to be filed down
to the length of the shorter one. If the cross-section of the rods is constant, the total loss of
weight through filing is proportional to $\sum |x_1 - x_2|$, which is a sum of independent mod-normal
variates. Occasions also arise when special mechanisms give rise to a total error
which is a sum of mod-normal errors or a sum of normal and mod-normal errors, although
in such situations there are only two or three error components. For instance, in the
manufacture of screw gauges two main errors arise, an error $x$ in the diameter of the gauge
and an error $y$ in its pitch. These two result in a total vertical displacement of the points of
contact of screw and gauge which is proportional to $x + c|y|$, where $c$ depends on the pitch
angle.

While dealing with some special cases of correlated error variables, the present author
was concerned with the evaluation of the moments of $T$ and was thereby naturally led to the
calculation of the 'absolute moments' of the multivariate normal distribution, namely,

$$(l, m, n, \dots) = \int_{-\infty}^{\infty} |x_1|^{l}\,|x_2|^{m}\,|x_3|^{n} \dots p(x)\,dx, \tag{1}$$

where $p(x)$ is the probability-density function of $x_1, x_2, \dots, x_n$, these variates having zero
means and given variances and covariances. Although we shall not be able to deal with
them in this paper, certain other statistics require for their study the knowledge of the
contribution to these absolute moments separately from each quadrant, i.e. integrals of the
type

$$[l, m, n, \dots] = \int_{0}^{\infty} x_1^{l}\,x_2^{m}\,x_3^{n} \dots p(x)\,dx. \tag{2}$$


* A detailed investigation of this statistic is the subject of a further paper by the present author 
printed on pp. 116-27 below. 







It is clear that the contributions from different quadrants can be obtained from (2) by
changing the signs of the appropriate elements of the correlation matrix and that (1) can be
found as the sum of all these contributions.

In §2 of the paper we shall deal with the evaluation of $[l, m, n]$ for the bivariate and the
trivariate cases and hence with that of $(l, m, n)$. Since the work was put in hand, papers have
been published by Nabeya (1951, 1952) giving the absolute moments for the bivariate and
the trivariate cases and a generalization in the form of a multiple integral for the multivariate
case.* Nabeya's method is not, however, applicable to the evaluation of $[l, m, n]$,
and the methods developed in the present paper are also believed to be simpler for evaluating
certain of the absolute moments $(l, m, n)$. In §3, $[l, m, n, \dots]$ and $(l, m, n, \dots)$ are obtained for
a multivariate distribution in the form of power series of correlation coefficients. Finally,
§4 is concerned with the application of the results to the distribution of $T$ and related
statistics.

2. DERIVATION OF THE ABSOLUTE MOMENTS 
In the following, without loss of generality, the variates are measured in terms of their
standard deviations, so that the n-variate normal distribution is given by

$$p(x_1, x_2, \dots, x_n) = (2\pi)^{-\frac{1}{2}n}\,\omega^{-\frac{1}{2}}\exp\left\{-\frac{1}{2\omega}\sum\omega_{rs}x_rx_s\right\} \quad (r, s = 1, 2, \dots, n),$$

where $\omega$ is the determinant of the correlation matrix, $\omega_{rs}$ are its cofactors, and the quadratic
form $\sum\omega_{rs}x_rx_s$ is positive definite.
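2.1. Univariate case

For the univariate case the results are well known and are easily found by direct
integration:

$$[r] = \tfrac{1}{2}(r) = (2\pi)^{-\frac{1}{2}}\int_{0}^{\infty} x^{r}e^{-\frac{1}{2}x^{2}}\,dx = \frac{2^{\frac{1}{2}r-1}}{\sqrt{\pi}}\,\Gamma\left(\frac{r+1}{2}\right). \tag{3}$$

The following representation is given only to illustrate the method which will be followed
in the bivariate and trivariate cases. Defining

$$R_0 = \int_{0}^{\infty} e^{-ax^{2}}\,dx = \frac{1}{2}\sqrt{\frac{\pi}{a}}, \qquad R_1 = \int_{0}^{\infty} xe^{-ax^{2}}\,dx = \frac{1}{2a} \quad (a > 0), \tag{4}$$

it is easy to see that

$$[r] = \tfrac{1}{2}(r) = \pi^{-\frac{1}{2}}\,2^{\frac{1}{2}r}\,P_r, \tag{5}$$

where

$$P_{2k} = \left[\left(-\frac{\partial}{\partial a}\right)^{k}R_0\right]_{a=1}, \qquad P_{2k+1} = \left[\left(-\frac{\partial}{\partial a}\right)^{k}R_1\right]_{a=1}, \tag{6}$$

so that the familiar results (3) are obtained.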
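For reference, the univariate result (3) is easily evaluated. The short sketch below (function names hypothetical) computes [r] from the gamma-function form and doubles it to give the absolute moment (r), which may be compared with the familiar values √(2/π), 1, 2√(2/π), 3, 8√(2/π) and 15 for r = 1, ..., 6.

```python
import math

def half_absolute_moment(r):
    """[r] = (2*pi)^(-1/2) * integral_0^inf x^r exp(-x^2/2) dx, as in (3)."""
    return 2 ** (r / 2 - 1) * math.gamma((r + 1) / 2) / math.sqrt(math.pi)

def absolute_moment(r):
    """(r) = E|x|^r for a standard normal variate, i.e. 2[r]."""
    return 2 * half_absolute_moment(r)

for r in range(1, 7):
    print(r, absolute_moment(r))
```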


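2.2. Bivariate case
For the bivariate case

$$[m, n] = \int_{0}^{\infty}\!\!\int_{0}^{\infty} x^{m}y^{n}p(x, y)\,dx\,dy, \qquad (m, n) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} |x^{m}|\,|y^{n}|\,p(x, y)\,dx\,dy, \tag{7}$$

where $p(x, y) = (2\pi)^{-1}(1-\rho^{2})^{-\frac{1}{2}}\exp\left\{-\dfrac{1}{2(1-\rho^{2})}(x^{2} + y^{2} - 2\rho xy)\right\}$,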


* We gratefully acknowledge the courtesy of Mr S. Nabeya for letting us see a manuscript of his 
second paper in advance of its publication. 











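the substitution $x' = [2(1-\rho^{2})]^{-\frac{1}{2}}x$, $y' = [2(1-\rho^{2})]^{-\frac{1}{2}}y$ transforms (7) into

$$\left.\begin{aligned}
[m, n] &= 2^{-\frac{1}{2}}\pi^{-1}[2(1-\rho^{2})]^{\frac{1}{2}(m+n+1)}\,P_{m,n}\\
\text{and}\quad (m, n) &= 2^{\frac{1}{2}}\pi^{-1}[2(1-\rho^{2})]^{\frac{1}{2}(m+n+1)}\,(P_{m,n} + Q_{m,n}),
\end{aligned}\right\} \tag{8}$$

where

$$P_{m,n} = \int_{0}^{\infty}\!\!\int_{0}^{\infty} x^{m}y^{n}\exp\{-(x^{2} + y^{2} - 2\rho xy)\}\,dx\,dy,$$

and $Q_{m,n}$ is obtained from $P_{m,n}$ by changing $\rho$ to $-\rho$. Defining three basic integrals

$$R_{00} = \int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x, y)\,dx\,dy, \qquad R_{10} = \int_{0}^{\infty}\!\!\int_{0}^{\infty} xf(x, y)\,dx\,dy, \qquad R_{01} = \int_{0}^{\infty}\!\!\int_{0}^{\infty} yf(x, y)\,dx\,dy, \tag{9}$$

where $f(x, y) = \exp\{-(\omega_{11}x^{2} + \omega_{22}y^{2} + 2\omega_{12}xy)\}$ and the quadratic form $\omega_{11}x^{2} + \omega_{22}y^{2} + 2\omega_{12}xy$
is positive definite, it is easy to see that $P_{m,n}$ can be obtained as a higher derivative of $R_{00}$ or
$R_{10}$ or $R_{01}$ (as the case may be) with respect to the parameters $\omega_{11}$, $\omega_{22}$, $\omega_{12}$, with the
substitution of $\omega_{11} = \omega_{22} = 1$ and $\omega_{12} = -\rho$ after differentiation. $Q_{m,n}$ is obtained from $P_{m,n}$ by
simply replacing $\rho$ by $-\rho$. For instance,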


Pa=[(- Sa 





a 


PRio 
and Pa= [(-» Pons Reka. —_—S 


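2.3. Bivariate case: Evaluation of the basic integrals

It now remains to evaluate the basic integrals $R_{00}$, $R_{10}$ and $R_{01}$. Changing to polar
co-ordinates and using standard integration, we obtain for $R_{00}$ the two expressions which
will be used in the following:

$$\left.\begin{aligned}
R_{00} &= \tfrac{1}{2}(\omega_{11}\omega_{22} - \omega_{12}^{2})^{-\frac{1}{2}}\,[\tfrac{1}{2}\pi - \sin^{-1}(\omega_{12}(\omega_{11}\omega_{22})^{-\frac{1}{2}})]\\
&= \tfrac{1}{2}(\omega_{11}\omega_{22} - \omega_{12}^{2})^{-\frac{1}{2}}\cos^{-1}(\omega_{12}(\omega_{11}\omega_{22})^{-\frac{1}{2}}).
\end{aligned}\right\} \tag{10}$$

The integrals $R_{10}$, $R_{01}$ may be evaluated in the same way or by the following method which
will also be used in the trivariate case. Clearly the transformation $\sqrt{\omega_{11}}\,x = x'$, $\sqrt{\omega_{22}}\,y = y'$
transforms

$$R_{10} = \omega_{11}^{-1}\omega_{22}^{-\frac{1}{2}}\,J_x, \tag{11}$$

where

$$J_x = \int_{0}^{\infty}\!\!\int_{0}^{\infty} x\exp\{-(x^{2} + y^{2} + 2axy)\}\,dx\,dy, \qquad a = (\omega_{11}\omega_{22})^{-\frac{1}{2}}\omega_{12}. \tag{12}$$

Similarly,

$$R_{01} = \omega_{11}^{-\frac{1}{2}}\omega_{22}^{-1}\,J_y, \tag{13}$$

where

$$J_y = \int_{0}^{\infty}\!\!\int_{0}^{\infty} y\exp\{-(x^{2} + y^{2} + 2axy)\}\,dx\,dy. \tag{14}$$

Now

$$J_x + aJ_y = \int_{0}^{\infty}\!\!\int_{0}^{\infty} (x + ay)\exp\{-(x^{2} + y^{2} + 2axy)\}\,dx\,dy,$$

and introducing $x' = x + ay$, $y' = y$ we have

$$J_x + aJ_y = \tfrac{1}{2}\int_{0}^{\infty} e^{-y'^{2}}\,dy' = \tfrac{1}{4}\sqrt{\pi}. \tag{15}$$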




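Similarly

$$aJ_x + J_y = \tfrac{1}{4}\sqrt{\pi}. \tag{16}$$

Solving (15) and (16), $J_x = J_y = \tfrac{1}{4}\sqrt{\pi}\,(1 + a)^{-1}$,
and substituting for $a$, $R_{10}$ and $R_{01}$ are obtained as

$$\left.\begin{aligned}
R_{10} &= \tfrac{1}{4}\sqrt{\pi}\,\omega_{11}^{-\frac{1}{2}}(\omega_{11}^{\frac{1}{2}}\omega_{22}^{\frac{1}{2}} + \omega_{12})^{-1},\\
R_{01} &= \tfrac{1}{4}\sqrt{\pi}\,\omega_{22}^{-\frac{1}{2}}(\omega_{11}^{\frac{1}{2}}\omega_{22}^{\frac{1}{2}} + \omega_{12})^{-1}.
\end{aligned}\right\} \tag{17}$$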
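The closed forms for the basic bivariate integrals can be checked against a crude quadrature. The sketch below assumes the forms R₀₀ = ½(ω₁₁ω₂₂ − ω₁₂²)^(−½) cos⁻¹{ω₁₂(ω₁₁ω₂₂)^(−½)} and R₁₀ = ¼√π ω₁₁^(−½)(ω₁₁^(½)ω₂₂^(½) + ω₁₂)⁻¹, and compares them with a midpoint-rule integral over the truncated quadrant 0 ≤ x, y ≤ 8; the function names, the truncation point and the grid size are arbitrary choices made for illustration.

```python
import math

def r00_closed(w11, w22, w12):
    """Closed form of R_00, second expression of (10)."""
    return 0.5 * math.acos(w12 / math.sqrt(w11 * w22)) / math.sqrt(w11 * w22 - w12 ** 2)

def r10_closed(w11, w22, w12):
    """Closed form of R_10 from (17)."""
    return 0.25 * math.sqrt(math.pi) / (math.sqrt(w11) * (math.sqrt(w11 * w22) + w12))

def quadrant_integral(g, w11, w22, w12, upper=8.0, steps=400):
    """Midpoint-rule value of the integral of g(x, y) f(x, y) over the
    positive quadrant, truncated at `upper` in each variable."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        for j in range(steps):
            y = (j + 0.5) * h
            total += g(x, y) * math.exp(-(w11 * x * x + w22 * y * y + 2 * w12 * x * y))
    return total * h * h

w = (1.0, 1.5, -0.4)      # omega_11, omega_22, omega_12 (positive definite)
print(r00_closed(*w), quadrant_integral(lambda x, y: 1.0, *w))
print(r10_closed(*w), quadrant_integral(lambda x, y: x, *w))
```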


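2.4. Trivariate case
For the trivariate normal distribution the contribution from the positive quadrant is
given by

$$[l, m, n] = (2\pi)^{-\frac{3}{2}}\omega^{-\frac{1}{2}}\int_{0}^{\infty}\!\!\int_{0}^{\infty}\!\!\int_{0}^{\infty} x_1^{l}x_2^{m}x_3^{n}\exp\left\{-\frac{1}{2\omega}\sum\omega_{rs}x_rx_s\right\}dx_1\,dx_2\,dx_3 \quad (r, s = 1, 2, 3). \tag{18}$$

The transformation $x_r' = (2\omega)^{-\frac{1}{2}}x_r$ carries (18) into

$$[l, m, n] = \pi^{-\frac{3}{2}}\,2^{\frac{1}{2}(l+m+n)}\,\omega^{\frac{1}{2}(l+m+n+2)}\,P_{l,m,n}, \tag{19}$$

where

$$P_{l,m,n} = \int_{0}^{\infty}\!\!\int_{0}^{\infty}\!\!\int_{0}^{\infty} x_1^{l}x_2^{m}x_3^{n}\exp\{-\textstyle\sum\omega_{rs}x_rx_s\}\,dx_1\,dx_2\,dx_3. \tag{20}$$

In the same manner we can show that the absolute moment

$$(l, m, n) = \pi^{-\frac{3}{2}}(2\omega)^{\frac{1}{2}(l+m+n+2)}\,[P^{(1)}_{l,m,n} + P^{(2)}_{l,m,n} + P^{(3)}_{l,m,n} + P^{(4)}_{l,m,n}],$$

where $P^{(i)}_{l,m,n}$ $(i = 1, 2, 3, 4)$ are obtained from (20) with the signs of $(\pm\omega_{rs})$, $r \neq s$, in each $P^{(i)}_{l,m,n}$
taken in accordance with the sign-pattern shown below:

    i    ω₁₂   ω₁₃   ω₂₃
    1     +     +     +
    2     +     −     −          (21)
    3     −     +     −
    4     −     −     +

As in the bivariate case, it can be shown that $P_{l,m,n}$ can be obtained by differentiations
with respect to $\omega_{rs}$ from the four basic integrals defined as follows:

$$\left.\begin{aligned}
R_{000} &= \iiint_{0}^{\infty} f(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3, & R_{100} &= \iiint_{0}^{\infty} x_1f(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3,\\
R_{010} &= \iiint_{0}^{\infty} x_2f(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3, & R_{001} &= \iiint_{0}^{\infty} x_3f(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3,
\end{aligned}\right\} \tag{22}$$

where $f(x_1, x_2, x_3) = \exp\{-\sum\omega_{rs}x_rx_s\}$ $(r, s = 1, 2, 3)$.

The four results $P^{(i)}_{l,m,n}$ $(i = 1, 2, 3, 4)$ are obtained essentially by the same differentiations,
the difference being merely in the insertion after differentiation of the appropriate signs of
$(\pm\omega_{rs})$, $r \neq s$, in accordance with (21).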










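2.5. Trivariate case: Evaluation of basic integrals $R_0$ and $R_1$
(a) Evaluation of $R_{000}$.
The transformation

$$\sqrt{\omega_{11}}\,x_1 = x, \qquad \sqrt{\omega_{22}}\,x_2 = y, \qquad \sqrt{\omega_{33}}\,x_3 = z$$

gives

$$R_{000} = (\omega_{11}\omega_{22}\omega_{33})^{-\frac{1}{2}}\iiint_{0}^{\infty}\exp\{-(x^{2} + y^{2} + z^{2} + 2cxy + 2ayz + 2bzx)\}\,dx\,dy\,dz, \tag{23}$$

where

$$a = (\omega_{22}\omega_{33})^{-\frac{1}{2}}\omega_{23}, \qquad b = (\omega_{33}\omega_{11})^{-\frac{1}{2}}\omega_{13}, \qquad c = (\omega_{11}\omega_{22})^{-\frac{1}{2}}\omega_{12}. \tag{24}$$

The transformation $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$ transforms the integral on
the right-hand side of (23) into

$$\int_{0}^{\frac{1}{2}\pi}\!\!\int_{0}^{\frac{1}{2}\pi}\!\!\int_{0}^{\infty}\exp\{-r^{2}[1 + 2(a\sin\phi + b\cos\phi)\sin\theta\cos\theta + 2c\sin^{2}\theta\sin\phi\cos\phi]\}\,r^{2}\sin\theta\,dr\,d\theta\,d\phi$$

$$= \tfrac{1}{4}\sqrt{\pi}\int_{0}^{\frac{1}{2}\pi}d\phi\int_{0}^{\frac{1}{2}\pi}[1 + 2(a\sin\phi + b\cos\phi)\sin\theta\cos\theta + 2c\sin^{2}\theta\sin\phi\cos\phi]^{-\frac{3}{2}}\sin\theta\,d\theta.$$

The substitution $t = \tan\theta$ gives, on integration,

$$R_{000} = \tfrac{1}{4}\sqrt{\pi}\,(\omega_{11}\omega_{22}\omega_{33})^{-\frac{1}{2}}\int_{0}^{\frac{1}{2}\pi}[BA^{-\frac{1}{2}}(B^{2} - A)^{-1} - (B^{2} - A)^{-1}]\,d\phi, \tag{25}$$

where $A = 1 + 2c\sin\phi\cos\phi$, $B = a\sin\phi + b\cos\phi$.

The second integral in (25) is evaluated at once; the first is integrated with the substitution
$\tan\phi + c = \tan\phi'$. After some simplification, we obtain

$$R_{000} = \tfrac{1}{4}\sqrt{\pi}\,(\omega_{11}\omega_{22}\omega_{33})^{-\frac{1}{2}}\,\Delta^{-\frac{1}{2}}\left\{\tfrac{1}{2}\pi - \sum_{a,b,c}\tan^{-1}\left(\frac{a - bc}{\sqrt{\Delta}}\right)\right\}, \tag{26}$$

where

$$\Delta = \begin{vmatrix} 1 & c & b\\ c & 1 & a\\ b & a & 1 \end{vmatrix}, \tag{27}$$

$W$ being the determinant

$$W = \begin{vmatrix} \omega_{11} & \omega_{12} & \omega_{13}\\ \omega_{12} & \omega_{22} & \omega_{23}\\ \omega_{13} & \omega_{23} & \omega_{33} \end{vmatrix}. \tag{28}$$

Since $a - bc = \omega_{11}^{-1}(\omega_{22}\omega_{33})^{-\frac{1}{2}}(\omega_{23}\omega_{11} - \omega_{12}\omega_{13})$, etc., we have finally

$$R_{000} = \tfrac{1}{4}\sqrt{\pi}\,W^{-\frac{1}{2}}\left\{\tfrac{1}{2}\pi + \sum\tan^{-1}\left(\frac{\omega_{12}\omega_{13} - \omega_{23}\omega_{11}}{\sqrt{(\omega_{11}W)}}\right)\right\}. \tag{29}$$

(b) Evaluation of $R_{100}$, $R_{010}$, $R_{001}$.
Employing the same transformation,

$$R_{100} = \omega_{11}^{-1}(\omega_{22}\omega_{33})^{-\frac{1}{2}}\,J_x, \tag{30}$$

where

$$J_x = \iiint_{0}^{\infty} x\exp\{-(x^{2} + y^{2} + z^{2} + 2cxy + 2ayz + 2bzx)\}\,dx\,dy\,dz,$$

$a$, $b$, $c$ having the same meaning as in (24). Similarly

$$R_{010} = \omega_{22}^{-1}(\omega_{33}\omega_{11})^{-\frac{1}{2}}\,J_y, \tag{31}$$

$$R_{001} = \omega_{33}^{-1}(\omega_{11}\omega_{22})^{-\frac{1}{2}}\,J_z, \tag{32}$$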







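where $J_y$, $J_z$ are defined in a manner similar to $J_x$. The substitution $x' = x + cy + bz$, $y' = y$,
$z' = z$ gives

$$J_x + cJ_y + bJ_z = \iiint x'\exp\{-[x'^{2} + (1 - c^{2})y'^{2} + (1 - b^{2})z'^{2} + 2(a - bc)y'z']\}\,dx'\,dy'\,dz'$$

$$= \tfrac{1}{2}\int_{0}^{\infty}\!\!\int_{0}^{\infty}\exp\{-(y'^{2} + z'^{2} + 2ay'z')\}\,dy'\,dz' = I_a, \tag{33}$$

where, from (10),

$$I_a = \tfrac{1}{4}(1 - a^{2})^{-\frac{1}{2}}\cos^{-1}(a). \tag{34}$$

Similarly,

$$cJ_x + J_y + aJ_z = I_b, \tag{35}$$

$$bJ_x + aJ_y + J_z = I_c, \tag{36}$$

where $I_b$, $I_c$ are defined as in (34). Solving these linear equations (33), (35) and (36) we have

$$J_x = \Delta^{-1}\{(1 - a^{2})I_a + (ab - c)I_b + (ac - b)I_c\}, \tag{37}$$

where $\Delta$ is defined as in (27) above. Expressing $a$, $b$, $c$ in terms of $\omega_{rs}$, we have

$$R_{100} = \tfrac{1}{4}W^{-1}\left\{(\omega_{22}\omega_{33} - \omega_{23}^{2})^{\frac{1}{2}}\cos^{-1}(\omega_{23}(\omega_{22}\omega_{33})^{-\frac{1}{2}})
+ \frac{\omega_{23}\omega_{13} - \omega_{12}\omega_{33}}{(\omega_{33}\omega_{11} - \omega_{13}^{2})^{\frac{1}{2}}}\cos^{-1}(\omega_{13}(\omega_{11}\omega_{33})^{-\frac{1}{2}})
+ \frac{\omega_{23}\omega_{12} - \omega_{13}\omega_{22}}{(\omega_{11}\omega_{22} - \omega_{12}^{2})^{\frac{1}{2}}}\cos^{-1}(\omega_{12}(\omega_{11}\omega_{22})^{-\frac{1}{2}})\right\}, \tag{38}$$

where $W$ is defined as in (28). $R_{010}$, $R_{001}$ will have similar expressions.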


2.6. Table of results

We give below a list of $[m, n]$ and $(m, n)$ for the bivariate case and $(l, m, n)$ for the trivariate
case up to weight six. For a fuller list of $(l, m, n)$ the reader is referred to Nabeya (1951,
1952). Values of $[l, m, n]$ for the trivariate case have not been given here because of their
lengthy expressions. The following observations may be made: (1) All $[m, n]$ and some
$[l, m, n]$ can be obtained by direct integration, but the procedure involved is very tedious.
(2) When the differentiations of the $R$ integrals concerned are with respect to $\omega_{rs}$ $(r \neq s)$ only,
simplification may be introduced by expressing them as differentiations with respect to
$a$, $b$, $c$. (3) Absolute moments, when $l$, $m$, $n$ are all even, are the same as ordinary moments
of the normal distribution. (4) It is easily seen that $(l, 0, 0) = (l)$ and $(l, m, 0) = (l, m)$. (5) We
may also note that the moments about zero of the distributions $p(x_1, x_2) = ce^{-Q(x_1, x_2)}$ and
$p(x_1, x_2, x_3) = ce^{-Q(x_1, x_2, x_3)}$, where $Q$ is a positive definite quadratic form and $x_1, x_2, x_3 \geq 0$, can
be obtained from formulae essentially similar to $[m, n]$ or $[l, m, n]$.


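Bivariate results for $[m, n]$:

$[0, 0] = \dfrac{1}{2\pi}(\tfrac{1}{2}\pi + \sin^{-1}\rho),$

$[1, 0] = \dfrac{1}{2\sqrt{(2\pi)}}(1 + \rho),$

$[2, 0] = \dfrac{1}{2\pi}\{\tfrac{1}{2}\pi + \sin^{-1}\rho + \rho\sqrt{(1 - \rho^{2})}\},$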








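$[1, 1] = \dfrac{1}{2\pi}\{\rho(\tfrac{1}{2}\pi + \sin^{-1}\rho) + \sqrt{(1 - \rho^{2})}\},$

$[3, 0] = \dfrac{1}{2\sqrt{(2\pi)}}(1 + \rho)^{2}(2 - \rho),$

$[2, 1] = \dfrac{1}{2\sqrt{(2\pi)}}(1 + \rho)^{2},$

$[4, 0] = \dfrac{1}{2\pi}\{3(\tfrac{1}{2}\pi + \sin^{-1}\rho) + \sqrt{(1 - \rho^{2})}\,(5\rho - 2\rho^{3})\},$

$[3, 1] = \dfrac{1}{2\pi}\{3\rho(\tfrac{1}{2}\pi + \sin^{-1}\rho) + \sqrt{(1 - \rho^{2})}\,(2 + \rho^{2})\},$

$[2, 2] = \dfrac{1}{2\pi}\{(1 + 2\rho^{2})(\tfrac{1}{2}\pi + \sin^{-1}\rho) + 3\rho\sqrt{(1 - \rho^{2})}\},$

$[5, 0] = \dfrac{1}{2\sqrt{(2\pi)}}(1 + \rho)^{3}(8 - 9\rho + 3\rho^{2}),$

$[4, 1] = \dfrac{1}{2\sqrt{(2\pi)}}(1 + \rho)^{3}(3 - \rho),$

$[3, 2] = \dfrac{1}{2\sqrt{(2\pi)}}\,2(1 + \rho)^{3},$

$[6, 0] = \dfrac{1}{2\pi}\{15(\tfrac{1}{2}\pi + \sin^{-1}\rho) + \sqrt{(1 - \rho^{2})}\,(33\rho - 26\rho^{3} + 8\rho^{5})\},$

$[5, 1] = \dfrac{1}{2\pi}\{15\rho(\tfrac{1}{2}\pi + \sin^{-1}\rho) + \sqrt{(1 - \rho^{2})}\,(8 + 9\rho^{2} - 2\rho^{4})\},$

$[4, 2] = \dfrac{1}{2\pi}\{3(1 + 4\rho^{2})(\tfrac{1}{2}\pi + \sin^{-1}\rho) + \sqrt{(1 - \rho^{2})}\,(13\rho + 2\rho^{3})\},$

$[3, 3] = \dfrac{1}{2\pi}\{3\rho(3 + 2\rho^{2})(\tfrac{1}{2}\pi + \sin^{-1}\rho) + \sqrt{(1 - \rho^{2})}\,(4 + 11\rho^{2})\}.$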


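Univariate, bivariate and trivariate absolute moments:

$(1, 0, 0) = \sqrt{(2/\pi)},$

$(2, 0, 0) = 1,$

$(1, 1, 0) = \dfrac{2}{\pi}\{\sqrt{(1 - \rho_{12}^{2})} + \rho_{12}\sin^{-1}\rho_{12}\},$

$(3, 0, 0) = 2\sqrt{(2/\pi)},$

$(2, 1, 0) = \sqrt{(2/\pi)}\,(1 + \rho_{12}^{2}),$

$(1, 1, 1) = \left(\dfrac{2}{\pi}\right)^{\frac{3}{2}}\{\omega^{\frac{1}{2}} + \sum(\rho_{23} + \rho_{12}\rho_{13})\sin^{-1}\rho_{23.1}\},$

$(4, 0, 0) = 3,$

$(3, 1, 0) = \dfrac{2}{\pi}\{\sqrt{(1 - \rho_{12}^{2})}\,(2 + \rho_{12}^{2}) + 3\rho_{12}\sin^{-1}\rho_{12}\},$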











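$(2, 2, 0) = 1 + 2\rho_{12}^{2},$

$(2, 1, 1) = \dfrac{2}{\pi}\{(\rho_{23} + 2\rho_{12}\rho_{13})\sin^{-1}\rho_{23} + \sqrt{(1 - \rho_{23}^{2})}\,(1 + \rho_{12}^{2} + \rho_{13}^{2})\},$

$(5, 0, 0) = 8\sqrt{(2/\pi)},$

$(4, 1, 0) = \sqrt{(2/\pi)}\,(3 + 6\rho_{12}^{2} - \rho_{12}^{4}),$

$(3, 2, 0) = 2\sqrt{(2/\pi)}\,(1 + 3\rho_{12}^{2}),$

$(3, 1, 1) = \left(\dfrac{2}{\pi}\right)^{\frac{3}{2}}\{(2 + \rho_{12}^{2} + \rho_{13}^{2})\,\omega^{\frac{1}{2}} + 2(\rho_{23} + 3\rho_{12}\rho_{13})\sin^{-1}\rho_{23.1}$
$\qquad + [3\rho_{13}(1 + \rho_{12}^{2}) + \rho_{12}\rho_{23}(3 - \rho_{12}^{2})]\sin^{-1}\rho_{13.2}$
$\qquad + [3\rho_{12}(1 + \rho_{13}^{2}) + \rho_{13}\rho_{23}(3 - \rho_{13}^{2})]\sin^{-1}\rho_{12.3}\},$

$(2, 2, 1) = \sqrt{(2/\pi)}\,(1 + 2\rho_{12}^{2} + \rho_{13}^{2} + \rho_{23}^{2} - \rho_{13}^{2}\rho_{23}^{2} + 4\rho_{12}\rho_{13}\rho_{23}),$

$(6, 0, 0) = 15,$

$(5, 1, 0) = \dfrac{2}{\pi}\{15\rho_{12}\sin^{-1}\rho_{12} + \sqrt{(1 - \rho_{12}^{2})}\,(8 + 9\rho_{12}^{2} - 2\rho_{12}^{4})\},$

$(4, 2, 0) = 3(1 + 4\rho_{12}^{2}),$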


2 ‘ 
(4,1,1) = > {3(p0s+ 401/13) SiN Pog + «/(1 — p33) (2 + 8p, + 8pis + pis 


w 
 4p.sPrsPs— hha + Aphpts+ 2), 
2 


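$(3, 3, 0) = \dfrac{2}{\pi}\{3\rho_{12}(3 + 2\rho_{12}^{2})\sin^{-1}\rho_{12} + \sqrt{(1 - \rho_{12}^{2})}\,(4 + 11\rho_{12}^{2})\},$

$(3, 2, 1) = \dfrac{2}{\pi}\{3(\rho_{13} + 2\rho_{12}\rho_{23} + 2\rho_{12}^{2}\rho_{13})\sin^{-1}\rho_{13}$
$\qquad + \sqrt{(1 - \rho_{13}^{2})}\,(2 + 6\rho_{12}^{2} + \rho_{13}^{2} + 2\rho_{23}^{2} - 2\rho_{13}^{2}\rho_{23}^{2} + 6\rho_{12}\rho_{13}\rho_{23})\},$

$(2, 2, 2) = 1 + 2(\rho_{12}^{2} + \rho_{13}^{2} + \rho_{23}^{2}) + 8\rho_{12}\rho_{13}\rho_{23}.$

Note. The expression $\omega$, included in (1, 1, 1), (3, 1, 1) and (4, 1, 1), is the determinant of the
correlation matrix, while the partial correlation coefficients included in (1, 1, 1) and (3, 1, 1)
are defined as $\rho_{23.1} = (\rho_{23} - \rho_{12}\rho_{13})/\{\sqrt{(1 - \rho_{12}^{2})}\sqrt{(1 - \rho_{13}^{2})}\}$, etc.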
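The quadrant results and the absolute moments are connected by the fact that the four quadrants contribute in pairs, so that (m, n) = 2{[m, n] at ρ + [m, n] at −ρ}. The sketch below checks this relation for three entries of weight up to four, assuming the forms [1, 1] = {ρ(½π + sin⁻¹ρ) + √(1 − ρ²)}/2π, [2, 1] = (1 + ρ)²/{2√(2π)} and [3, 1] = {3ρ(½π + sin⁻¹ρ) + √(1 − ρ²)(2 + ρ²)}/2π, together with the listed values of (1, 1, 0), (2, 1, 0) and (3, 1, 0); the function names are hypothetical.

```python
import math

def q11(rho):
    """[1, 1]: contribution of the positive quadrant to E|xy|."""
    return (rho * (math.pi / 2 + math.asin(rho)) + math.sqrt(1 - rho ** 2)) / (2 * math.pi)

def q21(rho):
    """[2, 1]."""
    return (1 + rho) ** 2 / (2 * math.sqrt(2 * math.pi))

def q31(rho):
    """[3, 1]."""
    return (3 * rho * (math.pi / 2 + math.asin(rho))
            + math.sqrt(1 - rho ** 2) * (2 + rho ** 2)) / (2 * math.pi)

def absolute_from_quadrants(q, rho):
    """(m, n) assembled from the quadrant result q at rho and -rho."""
    return 2 * (q(rho) + q(-rho))

def a11(rho):
    return 2 / math.pi * (math.sqrt(1 - rho ** 2) + rho * math.asin(rho))

def a21(rho):
    return math.sqrt(2 / math.pi) * (1 + rho ** 2)

def a31(rho):
    return 2 / math.pi * (math.sqrt(1 - rho ** 2) * (2 + rho ** 2) + 3 * rho * math.asin(rho))

rho = 0.3
print(absolute_from_quadrants(q11, rho), a11(rho))
print(absolute_from_quadrants(q21, rho), a21(rho))
print(absolute_from_quadrants(q31, rho), a31(rho))
```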


2.7. Generalization to the multivariate case

The principle underlying the method given above may now be stated in a general form.
The absolute moments of an n-variate normal distribution can be expressed in terms of the
derivatives of $n + 1$ basic integrals consisting of (i) $R_0$ and (ii) $n$ integrals of the type $R_1$, of
that order. The latter satisfy a set of $n$ linear equations among themselves and integrals of
the type $R_0$ of the $(n - 1)$th order and hence are determinate if the $R_0$ integral of the $(n - 1)$th
order is known. For instance, since $R_{000}$ is known, we can find $R_{1000}$, $R_{0100}$, $R_{0010}$, $R_{0001}$, where

$$R_{1000} = \iiiint_{0}^{\infty} x_1\exp\{-\textstyle\sum\omega_{rs}x_rx_s\}\,dx_1\,dx_2\,dx_3\,dx_4 \quad (r, s = 1, 2, 3, 4).$$
0 








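Thus

$$R_{1000} = \tfrac{1}{2}\,\omega_{11}^{-1}(\omega_{22}\omega_{33}\omega_{44})^{-\frac{1}{2}}\,\Delta^{-1}
\begin{vmatrix}
R_{000}(\alpha_{23}, \alpha_{34}, \alpha_{24}) & \alpha_{12} & \alpha_{13} & \alpha_{14}\\
R_{000}(\alpha_{13}, \alpha_{34}, \alpha_{14}) & 1 & \alpha_{23} & \alpha_{24}\\
R_{000}(\alpha_{14}, \alpha_{24}, \alpha_{12}) & \alpha_{23} & 1 & \alpha_{34}\\
R_{000}(\alpha_{12}, \alpha_{23}, \alpha_{13}) & \alpha_{24} & \alpha_{34} & 1
\end{vmatrix}, \tag{39}$$

where $\Delta = |\alpha_{ij}|$, $\alpha_{ij} = \omega_{ij}(\omega_{ii}\omega_{jj})^{-\frac{1}{2}}$,
and $R_{000}(a, b, c)$ is the three-dimensional integral

$$\iiint_{0}^{\infty}\exp\{-(x^{2} + y^{2} + z^{2} + 2axy + 2byz + 2czx)\}\,dx\,dy\,dz.$$

It seems, however, that the $R_0$ integral of every order has to be determined directly. Since
the evaluation of this integral becomes difficult for $n > 3$, the present approach gives, in
finite form, all absolute moments up to the trivariate normal distribution and also the
separate contribution of every quadrant, and similar odd-weight results ($l + m + n + k$ odd)
for the four-variate case.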


3. EXPRESSIONS FOR $[l, m, n, \dots]$ AND $(l, m, n, \dots)$ AS POWER-SERIES

3.1. General theory illustrated in four-variate case
Since the exact results of §2 are confined to the univariate, bivariate and trivariate cases
we are giving here expansions for the absolute moment and the separate contribution of
each quadrant in powers of correlation coefficients for the multivariate case. To avoid
complicated notation, only the four-variate case is considered, but, as can be seen from the
derivation, the method is quite general. Let

$$p(x) = \omega^{-\frac{1}{2}}(2\pi)^{-2}\exp\{-\tfrac{1}{2}\textstyle\sum\omega_{rs}x_rx_s/\omega\} \quad (r, s = 1, 2, 3, 4)$$

be the probability-density function. Then the characteristic function is $\exp\{-\tfrac{1}{2}\sum t_r^{2} - \sum_{r<s}\rho_{rs}t_rt_s\}$.
With a slight change in the notation adopted above, the contribution of the positive quadrant
to $(l_1, l_2, l_3, l_4)$ is

$$[l_1, l_2, l_3, l_4] = \iiiint_{0}^{\infty} x_1^{l_1}x_2^{l_2}x_3^{l_3}x_4^{l_4}\,p(x)\,dx$$

$$= \iiiint_{0}^{\infty} x_1^{l_1}x_2^{l_2}x_3^{l_3}x_4^{l_4}\left\{(2\pi)^{-4}\iiiint_{-\infty}^{\infty}\exp\{-\tfrac{1}{2}\textstyle\sum t_r^{2} - \sum_{r<s}\rho_{rs}t_rt_s - i\sum t_rx_r\}\,dt\right\}dx$$

$$= \sum_{p,q,r,s,t,u}(-1)^{p+q+r+s+t+u}\,\frac{\rho_{12}^{p}\rho_{13}^{q}\rho_{14}^{r}\rho_{23}^{s}\rho_{24}^{t}\rho_{34}^{u}}{p!\,q!\,r!\,s!\,t!\,u!}\;F_{l_1,\,p+q+r}\,F_{l_2,\,p+s+t}\,F_{l_3,\,q+s+u}\,F_{l_4,\,r+t+u}, \tag{40}$$

where

$$F_{l,j} = \frac{1}{2\pi}\int_{0}^{\infty} x^{l}\left(\int_{-\infty}^{\infty} t^{j}e^{-\frac{1}{2}t^{2} - itx}\,dt\right)dx. \tag{41}$$

~— OO w~ O&O 







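An exactly similar procedure gives the absolute moment

$$(l_1, l_2, l_3, l_4) = \iiiint_{-\infty}^{\infty} |x_1|^{l_1}\,|x_2|^{l_2}\,|x_3|^{l_3}\,|x_4|^{l_4}\,p(x)\,dx$$

$$= \sum_{p,q,r,s,t,u}(-1)^{p+q+r+s+t+u}\,\frac{\rho_{12}^{p}\rho_{13}^{q}\rho_{14}^{r}\rho_{23}^{s}\rho_{24}^{t}\rho_{34}^{u}}{p!\,q!\,r!\,s!\,t!\,u!}\;H_{l_1,\,p+q+r}\,H_{l_2,\,p+s+t}\,H_{l_3,\,q+s+u}\,H_{l_4,\,r+t+u}, \tag{42}$$

where

$$H_{l,j} = \frac{1}{2\pi}\int_{-\infty}^{\infty} |x|^{l}\left(\int_{-\infty}^{\infty} t^{j}e^{-\frac{1}{2}t^{2} - itx}\,dt\right)dx. \tag{43}$$

To evaluate $F_{l,j}$ and $H_{l,j}$ we note that

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} t^{j}e^{-\frac{1}{2}t^{2} - itx}\,dt = \frac{i^{j}}{(2\pi)^{\frac{1}{2}}}\frac{d^{j}}{dx^{j}}(e^{-\frac{1}{2}x^{2}}),$$

and therefore

$$F_{l,j} = \frac{i^{j}}{(2\pi)^{\frac{1}{2}}}\int_{0}^{\infty} x^{l}\frac{d^{j}}{dx^{j}}(e^{-\frac{1}{2}x^{2}})\,dx. \tag{44}$$

The repeated integration by parts gives

$$j \geq l:\quad F_{l,j} = \begin{cases}
\tfrac{1}{2}(-i)^{l}\,l! & (j - l = 0),\\[4pt]
0 & (j - l \text{ even}),\\[4pt]
\dfrac{(-i)^{l+1}\,l!\,(j-l-1)!}{(2\pi)^{\frac{1}{2}}\,2^{\frac{1}{2}(j-l-1)}\,\{\tfrac{1}{2}(j-l-1)\}!} & (j - l \text{ odd});
\end{cases} \tag{45}$$

$$l > j:\quad F_{l,j} = \frac{(-i)^{j}\,l!}{(2\pi)^{\frac{1}{2}}\,(l-j)!}\int_{0}^{\infty} x^{l-j}e^{-\frac{1}{2}x^{2}}\,dx
= \frac{(-i)^{j}\,l!\;2^{\frac{1}{2}(l-j-1)}\,\Gamma\{\tfrac{1}{2}(l-j+1)\}}{(2\pi)^{\frac{1}{2}}\,(l-j)!}. \tag{46}$$

In the case of $H_{l,j}$ it is easily shown that

$$H_{l,j} = \begin{cases} 0 & (j \text{ odd}),\\ 2F_{l,j} & (j \text{ even}). \end{cases} \tag{47}$$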
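The closed forms for F and H are readily programmed. As a minimal sketch, the code below assumes the forms F(l, l) = ½(−i)^l l!, F(l, j) = 0 for j − l even and positive, F(l, j) = (−i)^(l+1) l! (j−l−1)!/[(2π)^½ 2^((j−l−1)/2) {(j−l−1)/2}!] for j − l odd, and F(l, j) = (−i)^j l! 2^((l−j−1)/2) Γ{(l−j+1)/2}/[(2π)^½ (l−j)!] for l > j. It then sums the two-variate analogue of the series (42), namely Σ_p (−ρ)^p H(l₁, p) H(l₂, p)/p!, and compares it with the closed expressions for (1, 1), (2, 1) and (3, 1) quoted in §2.6. The function names and the number of terms retained are choices made here.

```python
import math

def F(l, j):
    """F_{l,j} of (41) from the closed forms (45) and (46); complex in general."""
    root_2pi = math.sqrt(2 * math.pi)
    if l > j:
        integral = 2 ** ((l - j - 1) / 2) * math.gamma((l - j + 1) / 2)
        return (-1j) ** j * math.factorial(l) * integral / (root_2pi * math.factorial(l - j))
    if j == l:
        return 0.5 * (-1j) ** l * math.factorial(l)
    if (j - l) % 2 == 0:
        return 0j
    k = j - l - 1                      # even and non-negative here
    return ((-1j) ** (l + 1) * math.factorial(l) * math.factorial(k)
            / (root_2pi * 2 ** (k // 2) * math.factorial(k // 2)))

def H(l, j):
    """H_{l,j} of (43), using (47)."""
    return 2 * F(l, j) if j % 2 == 0 else 0j

def bivariate_absolute_moment_series(l1, l2, rho, terms=60):
    """Two-variate analogue of (42): sum over p of (-rho)^p H H / p!."""
    total = 0j
    for p in range(terms):
        total += (-rho) ** p / math.factorial(p) * H(l1, p) * H(l2, p)
    return total.real

rho = 0.5
closed_11 = 2 / math.pi * (math.sqrt(1 - rho ** 2) + rho * math.asin(rho))
print(bivariate_absolute_moment_series(1, 1, rho), closed_11)
```

Only even powers of ρ survive in the absolute moment, since H vanishes for odd j, whereas the quadrant series (40) retains odd powers as well.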
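3.2. Illustrative example

$$[1, 1, 1, 1] = (F_{1,0})^{4} - (F_{1,1})^{2}(F_{1,0})^{2}\sum\rho_{ij} + \frac{(F_{1,2})^{2}(F_{1,0})^{2}}{2!}\sum\rho_{ij}^{2}
+ F_{1,2}(F_{1,1})^{2}F_{1,0}\sum\rho_{ij}\rho_{ik} + (F_{1,1})^{4}\sum\rho_{ij}\rho_{kl} - \frac{(F_{1,3})^{2}(F_{1,0})^{2}}{3!}\sum\rho_{ij}^{3} + \dots, \tag{48}$$

where $i$, $j$, $k$, $l$ are all different. The absolute moment (1, 1, 1, 1) is obtained from (48) by
replacing $F_{l,j}$ by $H_{l,j}$. The substitution for $F$'s and $H$'s gives

$$[1, 1, 1, 1] = \frac{1}{4\pi^{2}} + \frac{1}{8\pi}\sum\rho_{ij} + \frac{1}{8\pi^{2}}\sum\rho_{ij}^{2} + \frac{1}{8\pi}\sum\rho_{ij}\rho_{ik} + \frac{1}{16}\sum\rho_{ij}\rho_{kl}$$

$$\qquad + \frac{1}{16\pi}\sum\rho_{ij}^{2}\rho_{kl} + \frac{1}{4\pi^{2}}\sum\rho_{ij}\rho_{ik}\rho_{jk} + \frac{1}{8\pi}\sum\rho_{ij}\rho_{ik}\rho_{kl} + \frac{1}{96\pi^{2}}\sum\rho_{ij}^{4} + \dots$$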













4 
and (l,1,11)= 7 (1+ 42 p3; + Up; PinPjn + ga UPty + FZPi; Pin 


— E2 pi; Plat UP isPixPuPat ---)- 
It may be observed that $l_1 = l_2 = l_3 = l_4 = 0$ reduces $[l_1, l_2, l_3, l_4]$ to the integral $\iiiint p(x)\,dx$
taken over the positive quadrant (see Moran, 1948; Kendall, 1941, 1945).


4. APPLICATIONS TO THE DISTRIBUTION OF A 'TOTAL ERROR'
4-1. The moments in special cases 

Let us now consider the distribution of the total error $T$ mentioned in the introduction,
confining the discussion to two or three normal and mod-normal component errors. The
variances of the component errors will in general be unequal, but the problem can be reduced
to the consideration of the following standard forms of $T$, where, in conformity with the
preceding theoretical treatment, the $x$'s all have unit variance:

Case: (i) $x_1 + c|x_2|$, (ii) $|x_1| + c|x_2|$,

(iii) $c_1x_1 + c_2|x_2| + c_3|x_3|$, (iv) $c_1|x_1| + c_2|x_2| + c_3|x_3|$.

The distributions of $x_1, x_2$ and $x_1, x_2, x_3$ are bivariate and trivariate normal respectively
with zero means; $c, c_2, c_3$ lie between $-1$ and $+1$; $c_1 = 1$ but is retained as such when
dealing with case (iv) below to shorten the writing of moments.

The first four moments of $T$ calculated with the help of the absolute moments are given
below for cases (i), (ii) and (iv). In the last case, the moments $\mu'_i$ have been given about zero
and not the mean, as the expressions for $\mu_3$ and $\mu_4$ are very lengthy:


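Case (i). $T = x_1 + c|x_2|$.

$$\mu'_1 = c\Bigl(\frac{2}{\pi}\Bigr)^{\frac12},$$

$$\mu_2 = 1 + c^2\Bigl(1-\frac{2}{\pi}\Bigr),$$

$$\mu_3 = \Bigl(\frac{2}{\pi}\Bigr)^{\frac12}c\Bigl[3\rho^2 + c^2\Bigl(\frac{4}{\pi}-1\Bigr)\Bigr],$$

$$\mu_4 = 3[(1+c^2)^2 + 4c^2\rho^2] - \frac{4}{\pi}[3c^2(1+2\rho^2) + c^4] - \frac{12c^4}{\pi^2}.\qquad (49)$$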
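Case (ii). $T = |x_1| + c|x_2|$.

$$\mu'_1 = \Bigl(\frac{2}{\pi}\Bigr)^{\frac12}(1+c),$$

$$\mu_2 = 1 + c^2 + \frac{2}{\pi}\bigl[2c\bigl(\sqrt{1-\rho^2} + \rho\sin^{-1}\rho\bigr) - (1+c)^2\bigr],$$

$$\mu_3 = \Bigl(\frac{2}{\pi}\Bigr)^{\frac12}\bigl[3c(1+c)\rho^2 - (1+c^3)\bigr] + \Bigl(\frac{2}{\pi}\Bigr)^{\frac32}\bigl[2(1+c)^3 - 6c(1+c)\bigl(\sqrt{1-\rho^2} + \rho\sin^{-1}\rho\bigr)\bigr],$$

$$\mu_4 = 3[(1+c^2)^2 + 4c^2\rho^2] + \frac{2}{\pi}\bigl[6(1-c^2)^2 - 8(1+c)(1+c^3) - 12c(1+c)^2\rho^2 + 4c(1+c^2)\bigl(\sqrt{1-\rho^2}\,(2+\rho^2) + 3\rho\sin^{-1}\rho\bigr)\bigr] + \Bigl(\frac{2}{\pi}\Bigr)^{2}\bigl[12c(1+c)^2\bigl(\sqrt{1-\rho^2} + \rho\sin^{-1}\rho\bigr) - 3(1+c)^4\bigr].\qquad (50)$$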










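Case (iv). $T = c_1|x_1| + c_2|x_2| + c_3|x_3|$.

$$\mu'_1 = \Bigl(\frac{2}{\pi}\Bigr)^{\frac12}\sum c_i,$$

$$\mu'_2 = \sum c_i^2 + \frac{4}{\pi}\sum_{i<j}c_ic_j\bigl(\sqrt{1-\rho_{ij}^2} + \rho_{ij}\sin^{-1}\rho_{ij}\bigr),$$

$$\mu'_3 = \Bigl(\frac{2}{\pi}\Bigr)^{\frac12}\Bigl[2\sum c_i^3 + 3\sum_{i\ne j}c_i^2c_j(1+\rho_{ij}^2)\Bigr] + \Bigl(\frac{2}{\pi}\Bigr)^{\frac32}6c_1c_2c_3\Bigl[\omega^{\frac12} + \sum_{i<j}(\rho_{ij}+\rho_{ik}\rho_{jk})\sin^{-1}\Bigl(\frac{\rho_{ij}-\rho_{ik}\rho_{jk}}{\sqrt{(1-\rho_{ik}^2)(1-\rho_{jk}^2)}}\Bigr)\Bigr],$$

$$\mu'_4 = 3\sum c_i^4 + 6\sum_{i<j}c_i^2c_j^2(1+2\rho_{ij}^2) + \frac{8}{\pi}\Bigl\{\sum_{i\ne j}c_i^3c_j\bigl[\sqrt{1-\rho_{ij}^2}\,(2+\rho_{ij}^2) + 3\rho_{ij}\sin^{-1}\rho_{ij}\bigr] + 3\sum_{i<j}c_ic_jc_k^2\bigl[(\rho_{ij}+2\rho_{ik}\rho_{jk})\sin^{-1}\rho_{ij} + \sqrt{1-\rho_{ij}^2}\,(1+\rho_{ik}^2+\rho_{jk}^2)\bigr]\Bigr\}\qquad (51)$$

$(i,j,k = 1,2,3$; in each term $k$ is the index different from $i$ and $j)$,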





where $\omega$, used in the expression for $\mu'_3$, is the determinant of the correlation matrix of the
trivariate normal distribution of $x_1, x_2, x_3$.


4-2. Derivation of probability integrals 


To consider the distribution of $T$, we note that the probability density functions for the
first two cases are, respectively,

$$p(x_1, |x_2|) = p(x,y) = f(x,y;\rho) + f(x,y;-\rho)\qquad (-\infty<x<\infty,\ 0<y<\infty)\qquad (52)$$

and
$$p(|x_1|, |x_2|) = p(x,y) = 2[f(x,y;\rho) + f(x,y;-\rho)]\qquad (0<x,y<\infty),\qquad (53)$$

where
$$f(x,y;\rho) = (2\pi)^{-1}(1-\rho^2)^{-\frac12}\exp\{-\tfrac12(x^2+y^2-2xy\rho)/(1-\rho^2)\}.$$


Case (i). $T = x_1 + c|x_2|$.
The probability integral may be written as

$$\Pr\{T<d\} = \Pr\{x+cy<d\} = \iint_{x+cy<d}f(x,y;\rho)\,dx\,dy + \iint_{x+cy<d}f(x,y;-\rho)\,dx\,dy$$

$$= (I_1+I_2),\quad\text{where } -\infty<x<\infty,\ 0<y<\infty.\qquad (54)$$


By appropriate linear transformations the integrals $I_1, I_2$ may be expressed in terms
of functions of the type

$$p_\rho(h,k) = (2\pi)^{-1}(1-\rho^2)^{-\frac12}\int_h^\infty\!\!\int_k^\infty\exp\Bigl\{-\frac{1}{2(1-\rho^2)}(x^2+y^2-2\rho xy)\Bigr\}\,dx\,dy.$$

This function is tabulated in Karl Pearson's (1931) Tables for Statisticians and Biometricians,
2, Tables VIII and IX. Thus, for $d>0$,

$$I_1 = \tfrac12 - p_{\rho_1}(d_1,0),\quad \rho_1 = (\rho+c)/\sqrt{1+c^2+2c\rho},\quad d_1 = d/\sqrt{1+c^2+2c\rho},\qquad (55)$$

and for $d<0$ $(d = -d',\ d'>0)$

$$I_1 = p_{-\rho_1}(d_2,0),\quad \rho_1 = (\rho+c)/\sqrt{1+c^2+2c\rho},\quad d_2 = d'/\sqrt{1+c^2+2c\rho}.\qquad (56)$$

$I_2$ is obtained from (55) and (56) when $\rho$ is replaced by $-\rho$.
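
The reduction in (55) and (56) can be checked numerically. The sketch below is an illustration only and is not part of the paper: it assumes that $p_\rho(h,k)$ is the upper-orthant volume $\Pr\{x>h,\ y>k\}$ of a standard bivariate normal, evaluates it by one-dimensional Simpson quadrature instead of Pearson's tables, and compares $I_1+I_2$ with a direct integration over $x_2$. The function names (`p_rho`, `prob_formula`, `prob_direct`) and the quadrature settings are hypothetical choices.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def p_rho(h, k, rho):
    """Assumed meaning of p_rho(h, k): Pr{x > h, y > k}, correlation rho."""
    s = math.sqrt(1.0 - rho * rho)
    # Condition on x: Pr{y > k | x} = 1 - Phi((k - rho x) / s).
    return simpson(lambda x: phi(x) * (1.0 - Phi((k - rho * x) / s)), h, h + 12.0)


def integral_case_i(d, c, rho):
    """I_1 of (55)/(56); calling it with -rho gives I_2."""
    scale = math.sqrt(1.0 + c * c + 2.0 * c * rho)
    rho1 = (rho + c) / scale
    if d >= 0.0:
        return 0.5 - p_rho(d / scale, 0.0, rho1)
    return p_rho(-d / scale, 0.0, -rho1)


def prob_formula(d, c, rho):
    """Pr{x1 + c|x2| < d} as I_1 + I_2, following (54)."""
    return integral_case_i(d, c, rho) + integral_case_i(d, c, -rho)


def prob_direct(d, c, rho):
    """Same probability by integrating over x2 and conditioning x1 on x2."""
    s = math.sqrt(1.0 - rho * rho)
    return simpson(lambda y: phi(y) * Phi((d - c * abs(y) - rho * y) / s), -12.0, 12.0)


if __name__ == "__main__":
    for d in (0.5, -0.4):
        print(d, prob_formula(d, 0.6, 0.3), prob_direct(d, 0.6, 0.3))
```

The two routes should agree to within the quadrature error for any $|\rho|<1$ and $|c|\le 1$; the values of $d$, $c$ and $\rho$ used here are arbitrary.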








Case (ii). $T = |x_1| + c|x_2|$.

$$\Pr\{T<d\} = \Pr\{x+cy<d\} = 2\Bigl\{\iint_{x+cy<d}f(x,y;\rho)\,dx\,dy + \iint_{x+cy<d}f(x,y;-\rho)\,dx\,dy\Bigr\}$$

$$= 2\{I_1+I_2\}\qquad (0<x,y<\infty).\qquad (57)$$

Here it is necessary to consider $c>0$, $c<0$ separately.

$c>0$:
$$I_1 = p_\rho(0,0) - p_{\rho_1}(d_1,0) + p_{\rho_2}(d_1,0),\quad \rho_1 = (\rho+c)/\sqrt{1+c^2+2c\rho},$$
$$\rho_2 = -(1+c\rho)/\sqrt{1+c^2+2c\rho},\quad d_1 = d/\sqrt{1+c^2+2c\rho}.\qquad (58)$$

$c<0$ $(c = -c',\ c'>0)$:

(i) $d>0$:
$$I_1 = p_\rho(0,0) - p_{\rho_3}(d_3,0),\quad \rho_3 = (\rho-c')/\sqrt{1+c'^2-2c'\rho},\quad d_3 = d/\sqrt{1+c'^2-2c'\rho},\qquad (59)$$

(ii) $d<0$ $(d = -d',\ d'>0)$:
$$I_1 = p_{\rho_4}(d_4,0),\quad \rho_4 = (c'\rho-1)/\sqrt{1+c'^2-2c'\rho},\quad d_4 = d'/\sqrt{1+c'^2-2c'\rho}.\qquad (60)$$

The corresponding expressions for $I_2$ are obtained from (58) to (60) by changing $\rho$ to $-\rho$.
Case (iv). $T = c_1|x_1| + c_2|x_2| + c_3|x_3|$.
It is possible to find the probability integral of $T$ for specific values of $c_2, c_3$ by numerical
quadrature, although the procedure will, of necessity, be long and tedious. Alternatively,
a Pearson curve having the exact first four moments may be fitted to give an approximation.


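To explore this approach the simpler case of uncorrelated $x_1, x_2, x_3$ was investigated in some
detail. In this case the first four moments of $T$ are:

$$\mu'_1 = \Bigl(\frac{2}{\pi}\Bigr)^{\frac12}\sum c_i,\qquad \mu_2 = \Bigl(1-\frac{2}{\pi}\Bigr)\sum c_i^2,\qquad \mu_3 = \Bigl(\frac{2}{\pi}\Bigr)^{\frac12}\Bigl(\frac{4}{\pi}-1\Bigr)\sum c_i^3,$$

$$\mu_4 = \Bigl(3-\frac{4}{\pi}-\frac{12}{\pi^2}\Bigr)\sum c_i^4 + 6\Bigl(1-\frac{2}{\pi}\Bigr)^2\sum_{i<j}c_i^2c_j^2\qquad (i,j = 1,2,3).\qquad (61)$$

Setting $c_1 = 1$,

$$\beta_1 = \frac{\dfrac{2}{\pi}\Bigl(\dfrac{4}{\pi}-1\Bigr)^2(1+c_2^3+c_3^3)^2}{\Bigl(1-\dfrac{2}{\pi}\Bigr)^3(1+c_2^2+c_3^2)^3},\qquad \beta_2 = 3 + \frac{\dfrac{8}{\pi}\Bigl(1-\dfrac{3}{\pi}\Bigr)(1+c_2^4+c_3^4)}{\Bigl(1-\dfrac{2}{\pi}\Bigr)^2(1+c_2^2+c_3^2)^2}.\qquad (62)$$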





Table 1 gives $\beta_1, \beta_2$ for different values of $c_2$ and $c_3$. It is seen from the table that the $(\beta_1, \beta_2)$
points lie in the rectangle $0 \le \beta_1 \le 1$, $3 \le \beta_2 \le 4$ on the $\beta_1, \beta_2$ diagram. When $c_2, c_3$ are positive
the position of the points does not change very much and is always in the Type I region, but
if $c_2, c_3$ are negative it may cross over to Type VI region and then swing on to Type IV region.
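
The entries of Table 1 can be recomputed from the cumulants of a single mod-normal variate. The following sketch is an illustration rather than the author's computation; it assumes uncorrelated components with $c_1 = 1$, uses the additivity of cumulants for independent terms, and the function name `beta_uncorrelated` is a hypothetical choice.

```python
import math


def beta_uncorrelated(c2, c3):
    """beta_1 and beta_2 of T = |x1| + c2|x2| + c3|x3| for independent x's.

    Cumulants of |x| for a unit normal x:
      variance 1 - 2/pi, third cumulant sqrt(2/pi)(4/pi - 1),
      fourth cumulant 8/pi - 24/pi^2.
    The r-th cumulant of c|x| is c**r times that of |x|, and cumulants
    of independent terms add.
    """
    m2 = 2.0 / math.pi
    var1 = 1.0 - m2
    k3 = math.sqrt(m2) * (2.0 * m2 - 1.0)
    k4 = 8.0 / math.pi - 24.0 / math.pi ** 2

    s2 = 1.0 + c2 ** 2 + c3 ** 2
    s3 = 1.0 + c2 ** 3 + c3 ** 3
    s4 = 1.0 + c2 ** 4 + c3 ** 4

    beta1 = (k3 * s3) ** 2 / (var1 * s2) ** 3
    beta2 = 3.0 + k4 * s4 / (var1 * s2) ** 2
    return beta1, beta2


if __name__ == "__main__":
    for pair in ((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (-1.0, -1.0)):
        b1, b2 = beta_uncorrelated(*pair)
        print(pair, round(b1, 4), round(b2 - 3.0, 3))
```

For example, $c_2 = 1$, $c_3 = 0$ corresponds to the table entry 0.495 with $\beta_2 - 3 = 0.435$.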
















4-3. Adequacy of Pearson-curve approximation tested in special cases


That a Pearson curve may often give a good approximation to the probability integral 
(possibly even in the correlated case) is illustrated by the following examples. 

Example (i). For the uncorrelated case, with $T = |x_1| + |x_2|$ it is found from (61), (62)
by putting $c_2 = 1$, $c_3 = 0$, that $\mu'_1 = 1.5958$, $\sigma(T) = 0.853$, $\beta_1 = 0.495$, $\beta_2 = 3.435$. The upper and
lower 5 % points for the corresponding Pearson curve can be obtained from Pearson &
Merrington's (1951) table; these values are compared below with those obtained directly by
the use of the formula (58), using inverse interpolation:

                Lower 5 %    Upper 5 %
True            0.38         3.17
Pearson fit     0.43         3.17


Table 1. Values of $\beta_1$ and $\beta_2 - 3$ (bold type) for the distribution of $T = |x_1| + c_2|x_2| + c_3|x_3|$
when $x_1, x_2, x_3$ are uncorrelated





l 
—10 | -08 | -06] —04 | —0-2 | 


0-0 | 02 | o4 | 06 | o8 | 10 |% 


aX 


—1-0 | 0-0367 | 0-0141 | 0-0035 | 0-0°40 | 0-0575 | 0-000 | 0-0575 | 0-0340 10-0035 0-0141 |0-0367 | —1-0 
-290 ‘ 332 -290 














301 .377 | -418 | -435| -418 | -377 | -332 | -301 | 
~0-8 -of48 | -o092 | -0305| -0481 | -0535! -0514| -o518!| -0614 | -0836 | -120 —0-8 
-304 | -334 | -385 | -435 | -455 | -435 | -385 | -334 | -304 | -301 
-06 0628 | -146 | -217 | -242 | -226 | -203 | -195 | -208 | -240 —0-6 
-370 | -435 | -502 | -531| -502 | -435 | -370 | -334 | -332 
~0-4 328 | -494 | -556/} -511 | -431 | -374 | -356 | -368 | —0-4 
‘524 | 620 | 663 | -620 | -524 | -435 | 385 | 377 | 
~0-2 761 | -867| -786 | -639 | -527 | -473 | -463 | -0-2 
748 | -805| -748 | -620 | -502 | -435 | -418 | 
0-0 0-991 |0-895 |0-718 {0-582 |0-513 | 0-495 | 0-0 
-869 | -805 | -663 | -531 | -455 | -435 | 
0-2 11 | -659 | -541 | -483 | -470 | 02 
748 | -620 | -502 | -435 | -418 
0-4 548 | -462 | -422 | -419 0-4 
524 | -435 | -385 | -377 
0-6 .399 | -370 | -370 0-6 
370 | -334 | -332 | 
0-8 342 | 340 08 
+304 ‘301 ¢ 
1-0 330 | 10 
} -290 






































Example (2). The mean successive difference with the sample size three, i.e.
$$T = \tfrac12\{|x_1-x_2| + |x_2-x_3|\},$$
is an illustration of a correlated total error. In this case it can be shown using (50) that

$$\mu'_1 = 1.1284,\quad \sigma(T) = 0.670,\quad \beta_1 = 1.036,\quad \beta_2 = 4.296.$$
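
A short numerical sketch of how the moments of case (ii) lead to such figures is given here as an illustration; it is not the author's computation. It assumes the closed forms of case (ii) for $T = |x_1| + c|x_2|$ with correlation $\rho$, and treats the mean successive difference as $2^{-\frac12}(|z_1|+|z_2|)$ with unit-variance $z$'s and $\rho = -\tfrac12$. The function names are hypothetical.

```python
import math


def moments_case_ii(c, rho):
    """Mean and central moments mu2, mu3, mu4 of T = |x1| + c|x2|."""
    m2 = 2.0 / math.pi
    m = math.sqrt(m2)
    s = math.sqrt(1.0 - rho * rho)
    a = math.asin(rho)
    h = s + rho * a                              # (pi/2) E|x1 x2|
    g = s * (2.0 + rho * rho) + 3.0 * rho * a    # (pi/2) E|x1|^3 |x2|

    mean = m * (1.0 + c)
    mu2 = 1.0 + c * c + m2 * (2.0 * c * h - (1.0 + c) ** 2)
    mu3 = (m * (3.0 * c * (1.0 + c) * rho ** 2 - (1.0 + c ** 3))
           + m ** 3 * (2.0 * (1.0 + c) ** 3 - 6.0 * c * (1.0 + c) * h))
    mu4 = (3.0 * ((1.0 + c * c) ** 2 + 4.0 * c * c * rho ** 2)
           + m2 * (6.0 * (1.0 - c * c) ** 2 - 8.0 * (1.0 + c) * (1.0 + c ** 3)
                   - 12.0 * c * (1.0 + c) ** 2 * rho ** 2
                   + 4.0 * c * (1.0 + c * c) * g)
           + m2 ** 2 * (12.0 * c * (1.0 + c) ** 2 * h - 3.0 * (1.0 + c) ** 4))
    return mean, mu2, mu3, mu4


def shape(c, rho, scale=1.0):
    """Mean of scale*T together with the scale-free beta_1 and beta_2."""
    mean, mu2, mu3, mu4 = moments_case_ii(c, rho)
    return scale * mean, mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2


if __name__ == "__main__":
    # Mean successive difference, n = 3: scale 1/sqrt(2), c = 1, rho = -1/2.
    print(shape(1.0, -0.5, scale=1.0 / math.sqrt(2.0)))
    # Uncorrelated |x1| + |x2|, as in Example (i).
    print(shape(1.0, 0.0))
```

Only the mean, $\beta_1$ and $\beta_2$ are compared with the quoted values here, since $\beta_1$ and $\beta_2$ do not depend on the scale factor.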


The lower and upper 5 % points for the corresponding Pearson curve as obtained from the
Pearson-Merrington table are compared below with those obtained for the exact distribution
by using (58):

                Lower 5 %    Upper 5 %
True            0.27         2.40
Pearson fit     0.28         2.40

Example (3). The distribution of the mean deviation has been obtained by Godwin (1945).
Considering the case of the sample size three, we have, using (51),

$$\mu'_1 = 0.6515,\quad \sigma = 0.342,\quad \beta_1 = 0.417,\quad \beta_2 = 3.286.$$

The following is a comparison of the actual percentage points with those obtained from the
Pearson-Merrington table:

                Lower 0.5 %    Lower 5 %    Upper 5 %    Upper 0.5 %
True            0.052          0.166        1.276        1.703
Pearson fit     -0.009         0.157        1.270        1.696

It is to be noted that the last figure in the second line is not reliable since the
Pearson-Merrington table gives only two places of decimals for the standardized deviate.


Before concluding I should like to express my grateful thanks to Dr H. O. Hartley for his
advice during these investigations and to Prof. E. S. Pearson for helpful suggestions in the
writing of this paper.


REFERENCES 


Arey, N. & Harp, A. (1950). Math. Tidsskr. B, 86. 

Godwin, H. J. (1945). Biometrika, 33, 265.

Kendall, M. G. (1941). Biometrika, 32, 196.

Kendall, M. G. (1945). J. R. Statist. Soc. 108, 93.

Moran, P. A. P. (1948). Biometrika, 35, 203.

Nabeya, S. (1951). Ann. Inst. Statist. Math. 3, 2.

Nabeya, S. (1952). Ann. Inst. Statist. Math. 4, 15.

Pearson, E. S. & Merrington, M. (1951). Biometrika, 38, 4.

Pearson, K. (1931). Tables for Statisticians and Biometricians, 2.
London: Biometrika Trust.

Tricomi, F. (1936). G. Ist. Ital. Attuari, 7, 280.










[ 35 ] 


ON THE RANGE OF PARTIAL SUMS OF A FINITE NUMBER 
OF INDEPENDENT NORMAL VARIATES 


By A. A. ANIS and E. H. LLOYD
Imperial College, London 


The properties of the range of the partial sums $S_1, S_2, \dots, S_n$ of the independent variates $X_1, X_2, \dots, X_n$
are important in the theory of storage capacity. The asymptotic distribution, for large $n$, is known
(Feller, 1951). The present paper discusses the mean value of the range for finite $n$, in the case where
the $X_i$ are standard normal variates. This is shown to depend on the integral

$$(2\pi)^{-\frac12 r}\int_0^\infty\!\cdots(r)\cdots\!\int_0^\infty\exp-\tfrac12\{y_1^2+(y_1-y_2)^2+\dots+(y_{r-1}-y_r)^2+y_r^2\}\,dy_1\dots dy_r,$$

whose value is found to be $(r+1)^{-\frac32}$.
The expected value of the range is shown to be

$$\Bigl(\frac{2}{\pi}\Bigr)^{\frac12}\sum_{r=1}^{n-1}r^{-\frac12}.$$
1. INTRODUCTION 


The present investigation was suggested by the problem of planning the storage capacity
of a reservoir, where one would like to know the distribution of the water level over a given
number $n$ of years. The level after $r$ years may be regarded as the sum of $r$ annual increments,
and we may approximate this real problem by the ideal one where the annual increments
are independent variates with a common distribution. Applications to other storage
problems are obvious.

In the case of very large n the asymptotic range of the consequent random walk has been 
discussed by Feller (1951). In the present paper we are concerned with an exact formulation 
for the case of finite n, taking the increments to be standard normal variates, and for this 
case we obtain a simple expression for the mean range. 

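The paper falls into two parts. In the first (§§2-5), the problem is reduced to that of
evaluating the integral

$$\int_0^\infty\!\cdots(r)\cdots\!\int_0^\infty\exp\Bigl[\sum_{i=1}^{r-1}y_iy_{i+1}-\sum_{i=1}^{r}y_i^2\Bigr]\,dy_1\dots dy_r.$$

In the second part (§6) this integral is shown to have the value $(\sqrt{2\pi})^r(r+1)^{-\frac32}$, whence the
expected value of the range, over $n$ years, is found to be

$$\Bigl(\frac{2}{\pi}\Bigr)^{\frac12}\sum_{r=1}^{n-1}r^{-\frac12}.$$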


2. STATEMENT OF THE PROBLEM 


We consider $n$ independent variates $X_1, X_2, \dots, X_n$, each with the same probability density
function $\phi(x)$, and their partial sums

$$S_r = X_1+X_2+\dots+X_r\qquad (r = 1,2,\dots,n).$$

We let $U_n = \max_r\{S_r\}$ and $L_n = \min_r\{S_r\}$;




then the mean range is $\mathscr{E}(U_n-L_n) = \mathscr{E}(U_n)-\mathscr{E}(L_n)$. Let $f_n(x)$ be the probability density
function of $U_n$, and let $F_n(x)$ and $1-G_n(x)$ respectively be the distribution functions of $U_n$
and $L_n$, so that $F_n(x) = \Pr(U_n \le x)$ and $G_n(x) = \Pr(L_n \ge x)$.

We have
$$F_n(y) = \int\!\cdots(n)\cdots\!\int_{S_r\le y,\ r=1,2,\dots,n}\ \prod_{i=1}^{n}\phi(x_i)\,dx_i$$
and
$$G_n(y) = \int\!\cdots(n)\cdots\!\int_{S_r\ge y,\ r=1,2,\dots,n}\ \prod_{i=1}^{n}\phi(x_i)\,dx_i.$$

This latter may be alternatively written in the form

$$G_n(-y) = \int\!\cdots(n)\cdots\!\int_{S_r\le y,\ r=1,2,\dots,n}\ \prod_{i=1}^{n}\phi(-x_i)\,dx_i,$$

corresponding to the observation that $\min_r\{S_r\} = -\max_r\{-S_r\}$. The functions $F_n$ and $G_n$ are
thus integrals of the same type with identical boundary conditions. If $\phi(x)$ is an even
function it follows of course that $F_n(y) = G_n(-y)$.


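3. $\mathscr{E}(U_n)$ AS A LINEAR FORM IN THE $F_r(0)$

If we make the transformation
$$\sum_{i=1}^{r}x_i = y_0-y_r\quad\text{or}\quad x_r = y_{r-1}-y_r,$$
the above expression for $F_n$ becomes
$$F_n(y_0) = \int_0^\infty\!\cdots(n)\cdots\!\int_0^\infty\ \prod_{r=1}^{n}\phi(y_{r-1}-y_r)\,dy_r.$$
It will be convenient to define $F_0(x) = 1$. It is clear from this formulation that
$$F_{n+1}(y_0) = \int_0^\infty\phi(y_0-y_1)F_n(y_1)\,dy_1,\qquad (3\text{-}1)$$
whence, on differentiating and then integrating by parts,

$$f_{n+1}(y_0) = F_n(0)\,\phi(y_0) + \int_0^\infty\phi(y_0-y_1)f_n(y_1)\,dy_1.\qquad (3\text{-}2)$$

Using this as a reduction formula we are led to

$$f_{n+1}(y_0) = \sum_{r=0}^{n}F_{n-r}(0)\,h_r(y_0),\qquad (3\text{-}3)$$
where

$$h_r(y_0) = \int_0^\infty\!\cdots(r)\cdots\!\int_0^\infty\phi(y_0-y_1)\phi(y_1-y_2)\dots\phi(y_{r-1}-y_r)\phi(y_r)\,dy_1\dots dy_r\qquad (r = 1,2,\dots,n),$$
and $h_0(y) = \phi(y)$.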










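The expected value $\mu_{n+1}$ of $U_{n+1}$ is equal to $\int_{-\infty}^{+\infty}x f_{n+1}(x)\,dx$, and, using (3-3), we obtain

$$\mu_{n+1} = \sum_{r=0}^{n}q_rF_{n-r}(0),\qquad (3\text{-}4)$$
where
$$q_r = \int_{-\infty}^{+\infty}y_0h_r(y_0)\,dy_0.$$
Since
$$\int_{-\infty}^{+\infty}y_0\,\phi(y_0-y_1)\,dy_0 = y_1,$$
it is easily seen that

$$q_r = \int_0^\infty\!\cdots(r)\cdots\!\int_0^\infty y_1\,\phi(y_1-y_2)\dots\phi(y_{r-1}-y_r)\phi(y_r)\,dy_1\dots dy_r\qquad (r = 1,2,\dots,n),\qquad (3\text{-}5)$$
with $q_0 = 0$.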
4. AN IDENTITY IN THE DISTRIBUTION FUNCTIONS

If we consider a particular $S_r$ we have

$$S_{r-i} = S_r-(X_r+X_{r-1}+\dots+X_{r-i+1}) = S_r-T_i,\ \text{say},\qquad (i = 1,2,\dots,r-1),$$
and
$$S_{r+j} = S_r+(X_{r+1}+X_{r+2}+\dots+X_{r+j}) = S_r+R_j,\ \text{say},\qquad (j = 1,2,\dots,n-r).$$
If $S_r$ is the greatest of the partial sums we must have

$$T_i \ge 0\quad (i = 1,2,\dots,r-1)\quad\text{and}\quad R_j \le 0\quad (j = 1,2,\dots,n-r),$$

and, in particular, $\min_i\{T_i\} \ge 0$, $\max_j\{R_j\} \le 0$. Now $T_i$ and $R_j$ are independent partial sums
so that
$$\Pr\bigl(S_r = \max_k\{S_k\}\bigr) = G_{r-1}(0)\,F_{n-r}(0).$$

Further, since some $S_r$ must be the largest, the probability being 0 that any two partial sums
are equal, we have

$$\sum_{r=1}^{n}G_{r-1}(0)\,F_{n-r}(0) = 1.$$
Where $\phi(x)$ is an even function this becomes

$$\sum_{r=1}^{n}F_{r-1}(0)\,F_{n-r}(0) = 1.\qquad (4\text{-}1)$$


5. THE SPECIAL CASE OF NORMAL X; 


We now turn to the special case where the $X_i$ are standard normal variates, so that
$\phi(u) = \phi(-u)$ and $\phi'(u) = -u\phi(u)$. Applying this latter result after differentiating (3-1) we
get, with a slight change of notation,

$$f_{n+1}(y) = \int_0^\infty(x-y)\,\phi(y-x)\,F_n(x)\,dx.$$
If we multiply (3-1) by $y$ and add to this the last result, we find

$$f_{n+1}(y)+yF_{n+1}(y) = \int_0^\infty x\,\phi(y-x)\,F_n(x)\,dx.\qquad (5\text{-}1)$$













Taking n = 0,1,...,r—1 in this formula, and applying the results to the reduction of the 
expression (3-5) for g,, we find 


qd, = in (r- nf hws P(Y2— Ys) --- P(Yr-1 — Yr) PLY) AY --- UY, 
0 0 
+... + a [efor P(Y,—1 = Yr) (y,) dy,_,dy, 
+ {"f-alun) dun) dye + [we F alee) 6.) Ae (5-2) 


The last term in this expression equals -{" F__,(y,) $’(y,) dy,, and this may be integrated 
0 

to give eS 

F,_(0) $0) + {" faye) (Ue) dy 


This last integral is identical with the penultimate term of (5-2). If in this we substitute for 
f,-1(y,) according to (3-2), the integral becomes 


[hau $(Y,) dy, bs F,_2(9) : P*(y,) dy, + [fete P(Yr—-1 Th Yr) P(y,) dy,_,dy,. 
0 0 0oJ0 


The last integral here again coincides with a term of (5-2). Continuing in this way we finally 
obtain for (5-2) 


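$$q_r = \sum_{s=1}^{r}s\,b_s\,F_{r-s}(0)\qquad (r = 1,2,\dots,n),\qquad (5\text{-}3)$$
where

$$b_s = \int_0^\infty\!\cdots(s-1)\cdots\!\int_0^\infty\phi(y_{r-s+2})\,\phi(y_{r-s+2}-y_{r-s+3})\dots\phi(y_{r-1}-y_r)\,\phi(y_r)\,dy_{r-s+2}\dots dy_r\qquad (s = 3,4,\dots,n),\qquad (5\text{-}4)$$

with $b_1 = \phi(0)$ and $b_2 = \int_0^\infty\phi^2(y)\,dy$.

Using the expansion (3-4) for $\mu_{n+1}$, we obtain
$$\mu_{n+1} = \sum_{r=1}^{n}\sum_{s=1}^{r}s\,b_s\,F_{r-s}(0)\,F_{n-r}(0),$$
which, with the aid of the identity (4-1), reduces to
$$\mu_{n+1} = \sum_{s=1}^{n}s\,b_s.\qquad (5\text{-}5)$$

The problem is thus reduced to that of evaluating the $b_s$. It is convenient to rewrite (5-4)
in the form

$$b_s = (2\pi)^{-\frac12 s}c_{s-1}\qquad (s = 1,2,\dots,n),\qquad (5\text{-}6)$$
where, explicitly now, $c_0 = 1$, $c_1 = \int_0^\infty e^{-y^2}\,dy$,

$$c_s = \int_0^\infty\!\cdots(s)\cdots\!\int_0^\infty\exp-\tfrac12\{y_1^2+(y_1-y_2)^2+\dots+(y_{s-1}-y_s)^2+y_s^2\}\,dy_1\dots dy_s\qquad (s = 2,3,\dots,n-1).\qquad (5\text{-}7)$$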


6. EVALUATION OF THE $c_r$


The remaining part of the paper is concerned with evaluating the c,. We use matrix methods, 
with the convention that lower- and upper-case bold-face letters denote column vectors 
and square matrices respectively. An expression such as 


A>0, 







where A is a vector of order k, means 


A,;20 (= Re | 
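We write (5-7) as

$$c_r = \int\!\cdots(r)\cdots\!\int_{y\ge0}e^{-\frac12y'Ay}\,dy_1\dots dy_r,\qquad (6\text{-}1)$$
where $A$ is the $(r\times r)$ matrix
$$A = \begin{pmatrix}2&-1&0&\cdots&0&0\\-1&2&-1&\cdots&0&0\\0&-1&2&\cdots&0&0\\\vdots&&&\ddots&&\vdots\\0&0&0&\cdots&2&-1\\0&0&0&\cdots&-1&2\end{pmatrix}.$$

This matrix is symmetric and positive-definite, and may therefore be resolved into
triangular factors: $A = T'T$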


where T is upper triangular. It is easily found that 








The exponent in the integrand of $c_r$ in (6-1) thus becomes
$$y'Ay = y'T'Ty = u'u,$$
say, where $u = Ty$.

Let us make this substitution in the integral. The Jacobian is

$$1/|A|^{\frac12} = 1/\sqrt{r+1}.$$
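
The triangular factor and the value $|A| = r+1$ used in the Jacobian can be confirmed numerically. The sketch below is an illustration only: it builds the tridiagonal matrix $A$, computes an upper-triangular $T$ with $T'T = A$ by the usual Cholesky recursion, and compares the result with the pattern $t_{ss}^2 = (s+1)/s$, $t_{s,s+1} = -\sqrt{s/(s+1)}$, which is inferred here from the columns of $L = T/\sqrt2$ quoted later in the section. The function names are hypothetical.

```python
import math


def tridiagonal_A(r):
    """The r x r matrix with 2 on the diagonal and -1 next to it."""
    return [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
             for j in range(r)] for i in range(r)]


def upper_cholesky(A):
    """Upper-triangular T with T'T = A (A symmetric positive-definite)."""
    n = len(A)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # A[i][j] = sum over k <= i of T[k][i] * T[k][j]
            rest = A[i][j] - sum(T[k][i] * T[k][j] for k in range(i))
            if i == j:
                T[i][i] = math.sqrt(rest)
            else:
                T[i][j] = rest / T[i][i]
    return T


def determinant_from_factor(T):
    """|A| = |T|^2 = product of squared diagonal elements of T."""
    d = 1.0
    for i in range(len(T)):
        d *= T[i][i] ** 2
    return d


if __name__ == "__main__":
    r = 5
    T = upper_cholesky(tridiagonal_A(r))
    print(determinant_from_factor(T))   # close to r + 1
    print([round(T[s][s] ** 2, 6) for s in range(r)])
```

Since $|T| = |A|^{\frac12}$, the substitution $u = Ty$ contributes the factor $1/\sqrt{r+1}$ quoted above.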
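Then (6-1) becomes

$$(r+1)^{\frac12}c_r = \int\!\cdots(r)\cdots\!\int_{T^{-1}u\ge0}e^{-\frac12u'u}\,du_1\dots du_r.\qquad (6\text{-}2)$$
The region of integration may be written in the form
$$u = Tx\qquad (x \ge 0),$$
or, slightly more conveniently for our purpose, as
$$u = Lx\qquad (x \ge 0),\qquad (6\text{-}3)$$
where
$$L = T/\sqrt2 = (\lambda_1, \lambda_2, \dots, \lambda_r)$$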












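and the columns $\lambda_i$ of $L$ are defined explicitly by
$$\lambda_1 = (1, 0_{r-1})',$$

$$\lambda_s = \Bigl(0_{s-2},\ -\sqrt{\frac{s-1}{2s}},\ \sqrt{\frac{s+1}{2s}},\ 0_{r-s}\Bigr)'\qquad (s = 2,3,\dots,r).$$

Here $0_i$ denotes the zero vector of order $i$. We note that the $\lambda_i$ are normalized, i.e.
$$\lambda_i'\lambda_i = 1\qquad (i = 1,2,\dots,r).$$
We now introduce a vector $\lambda_{r+1}$ defined by the relation

$$\lambda_1+\lambda_2+\dots+\lambda_{r+1} = 0.$$
It is easily verified that we then have
$$\lambda_i'\lambda_{i+1} = -\tfrac12\qquad (i = 1,2,\dots,r+1),$$
$$\lambda_i'\lambda_{i+j} = 0\qquad (j>1),$$

the suffices being reduced modulo $r+1$ when appropriate, and that the new vector $\lambda_{r+1}$ is
normalized.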


We now introduce the following $(r\times r)$ matrices:

$$L_1 = (\lambda_2, \lambda_3, \dots, \lambda_{r+1}),$$
$$L_2 = (\lambda_3, \lambda_4, \dots, \lambda_{r+1}, \lambda_1),$$
$$\dots\dots\dots\dots$$
$$L_{r+1} = (\lambda_1, \lambda_2, \dots, \lambda_r);$$
the last of these being the matrix $L$ of (6-3). It may be seen that
$$L_1'L_1 = L_2'L_2 = \dots = L_{r+1}'L_{r+1}\ (= \tfrac12A),$$
whence the $L_i$ may be shown to be orthogonal transforms of one another; for let

$$PL_i = L_j\qquad (i \ne j);$$

then
$$L_i'P'PL_i = L_j'L_j = L_i'L_i,$$
so that
$$P'P = L_i'^{-1}L_i'L_iL_i^{-1} = I,$$

where $I$ denotes the unit matrix. Hence $P$ is orthogonal.
Since the integrand of $c_r$ is invariant under orthogonal transformations of $u$, we may
therefore describe the region of integration (6-3) in any of the alternative forms

$$\mathscr{R}_i:\quad u = L_iy\qquad (y \ge 0;\ i = 1,2,\dots,r+1).$$

Hence
$$(r+1)^{\frac12}c_r = \int\!\cdots(r)\cdots\!\int_{\mathscr{R}_{r+1}} = \int\!\cdots(r)\cdots\!\int_{\mathscr{R}_i} = (r+1)^{-1}\sum_{i=1}^{r+1}\int\!\cdots(r)\cdots\!\int_{\mathscr{R}_i},\qquad (6\text{-}4)$$

where in each case the integrand is that of (6-2).







We now prove

(i) that the $r+1$ regions $\mathscr{R}_i$ are non-overlapping, so that the sum of integrals in (6-4) may
be replaced by a single $r$-fold integral over the union of the $\mathscr{R}_i$; and

(ii) that the regions $\mathscr{R}_i$ are exhaustive, so that their union is the whole space.

Integration is then immediate. 


(i) The regions are non-overlapping 


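Consider, for example, the regions $\mathscr{R}_1$ and $\mathscr{R}_2$. If $u$ is any point common to them, we shall
have
$$u = L_1x = L_2y\qquad (x \ge 0,\ y \ge 0).$$

Then $L_1'L_1x = L_1'L_2y$,
or, explicitly,
$$\begin{pmatrix}1&-\tfrac12&0&\cdots&0&0\\-\tfrac12&1&-\tfrac12&\cdots&0&0\\\vdots&&\ddots&&&\vdots\\0&0&\cdots&-\tfrac12&1&-\tfrac12\\0&0&\cdots&0&-\tfrac12&1\end{pmatrix}x = \begin{pmatrix}-\tfrac12&0&\cdots&0&0&-\tfrac12\\1&-\tfrac12&\cdots&0&0&0\\-\tfrac12&1&\cdots&0&0&0\\\vdots&&\ddots&&&\vdots\\0&0&\cdots&-\tfrac12&1&-\tfrac12\end{pmatrix}y.$$

The solution of this set of equations is
$$x_s = y_{s-1}-y_r\qquad (s = 2,3,\dots,r),$$
$$x_1 = -y_r.$$
Since by hypothesis the elements of $x$ and $y$ are all non-negative, it follows that
$$x_1 = y_r = 0.$$
Any common points $u$ thus lie in an $(r-1)$-dimensional subspace, whose measure in $r$-space
is of course zero. Speaking geometrically, we may say that the regions have common
boundaries but are non-intersecting.

A precisely similar argument applies to all other pairs of regions $\mathscr{R}_i, \mathscr{R}_j$. Hence

$$c_r = (r+1)^{-\frac32}\int\!\cdots(r)\cdots\!\int_{\mathscr{R}}e^{-\frac12u'u}\,du_1\dots du_r,\qquad (6\text{-}5)$$
where the region of integration $\mathscr{R}$ is the union of the regions $\mathscr{R}_i$:
$$\mathscr{R} = \bigcup_{i=1}^{r+1}\mathscr{R}_i.$$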


(ii) The $r+1$ regions $\mathscr{R}_i$ are exhaustive
We have to prove that every $r$-dimensional vector is of the form
$$u = L_iy\qquad (y \ge 0),$$

for some $i$. Now every set of $r$ of the basis vectors $\lambda_i$ is linearly independent, so that we may
write
$$u = L_1x = L_2y = \dots = L_{r+1}w,$$
for any $u$, with appropriate values of $x, y, \dots, w$. Let us write these representations in the
forms
$$u = \lambda_2x_1+\dots+\lambda_{r+1}x_r,$$
$$\dots\dots\dots\dots$$
$$u = \lambda_1w_1+\dots+\lambda_rw_r.$$













Our proof is by induction. Assume for a particular value of $s$ that, if in one of these
representations the number of negative coefficients does not exceed $s$, then $u$ lies in one of
the regions $\mathscr{R}_i$. Now consider the case where in some representation (say the last) there are
$s+1$ negative coefficients. Suppose, as we may with no loss of generality, that these are
$w_1, \dots, w_{s+1}$, so that
$$w_i = -\omega_i\qquad (i = 1,2,\dots,s+1),$$
and
$$w_j = +\omega_j\qquad (j = s+2,\dots,r),$$
where all the $\omega_i$ are non-negative.
We replace $\lambda_1$ by $-\lambda_2-\dots-\lambda_{r+1}$. The representation then becomes
$$u = -\omega_1\lambda_1-\dots-\omega_{s+1}\lambda_{s+1}+\omega_{s+2}\lambda_{s+2}+\dots+\omega_r\lambda_r$$
$$= (\omega_1-\omega_2)\lambda_2+\dots+(\omega_1-\omega_{s+1})\lambda_{s+1}+(\omega_1+\omega_{s+2})\lambda_{s+2}+\dots+(\omega_1+\omega_r)\lambda_r+\omega_1\lambda_{r+1}.$$
This is a representation of $u$ in which the only coefficients that can possibly be negative are
those attached to $\lambda_2, \dots, \lambda_{s+1}$. Thus there are at most $s$ negative coefficients, whence, if our
initial assumption were valid, it would follow that the point under discussion must belong
to one of the regions $\mathscr{R}_i$.


Now the assumption is certainly valid for $s = 0$, and is therefore true in general for
$s = 0, 1, \dots, r$. Thus any $r$-dimensional vector whose negative coefficients in some
representation do not exceed $r$ in number, must belong to one of the regions $\mathscr{R}_i$, and since this
clearly exhausts all the possible sign variations of the $r$ coefficients our lemma is proved.
It follows then that the union of the $\mathscr{R}_i$ is the whole space. Hence (6-5) becomes

$$(r+1)^{\frac32}c_r = \int_{-\infty}^{+\infty}\!\cdots(r)\cdots\!\int_{-\infty}^{+\infty}e^{-\frac12u'u}\,du_1\dots du_r = \Bigl(\int_{-\infty}^{+\infty}e^{-\frac12u^2}\,du\Bigr)^r = (\sqrt{2\pi})^r.$$
Referring back to (5-6) we have
$$b_r = (2\pi)^{-\frac12r}c_{r-1} = (2\pi r^3)^{-\frac12},$$
and finally, by (5-5),
$$\sqrt{2\pi}\,\mu_{n+1} = \sum_{r=1}^{n}r^{-\frac12}.$$
In conclusion, we note that the asymptotic value of the mean range, for large $n$, is
$2\bigl(\frac{2}{\pi}\bigr)^{\frac12}n^{\frac12}$, in agreement with Feller's results.
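
As a numerical illustration of these closing formulas, the following sketch evaluates the exact mean range for finite $n$ and compares it with the asymptotic value. It assumes that the mean range of $S_1, \dots, S_n$ is $2\mu_n$ with $\sqrt{2\pi}\,\mu_n = \sum_{r=1}^{n-1}r^{-\frac12}$, which is the result above with $n+1$ replaced by $n$; the function names are hypothetical.

```python
import math


def expected_maximum(n):
    """mu_n = E max(S_1, ..., S_n) for standard normal increments."""
    return sum(r ** -0.5 for r in range(1, n)) / math.sqrt(2.0 * math.pi)


def expected_range(n):
    """Mean range of S_1, ..., S_n; by symmetry E(L_n) = -E(U_n)."""
    return 2.0 * expected_maximum(n)


def asymptotic_range(n):
    """Large-n value 2 (2/pi)^(1/2) n^(1/2) quoted in the text."""
    return 2.0 * math.sqrt(2.0 * n / math.pi)


if __name__ == "__main__":
    for n in (2, 10, 100, 10000):
        print(n, expected_range(n), asymptotic_range(n))
```

For $n = 2$ the range is $|X_2|$, whose mean is $(2/\pi)^{\frac12}$, and the exact formula reproduces this; the ratio of exact to asymptotic value approaches 1 only slowly, from below, as $n$ grows.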


REFERENCE 


FELLER, W. (1951). The asymptotic distribution of the range of sums of independent random variables. 
Ann. Math. Statist. 22, 427. 










[ 43 ] 


NOTE ON ‘THE JACOBIANS OF CERTAIN MATRIX 
TRANSFORMATIONS USEFUL IN MULTIVARIATE ANALYSIS’ 


By INGRAM OLKIN 
Michigan State College* 


SUMMARY AND INTRODUCTION 


General techniques for the evaluation of the Jacobian of a matrix transformation were
outlined by Deemer & Olkin (1951). In particular, (a) the Jacobian of a non-linear
transformation is equal to the Jacobian of the linear transformation in the differentials, (b) by
the introduction of suitable variables, the Jacobian is equal to a product of Jacobians which
are easily calculated. This note is concerned with a number of transformations not
considered in the earlier paper. Certain of the transformations considered will be recognized as
extensions of those given previously, while others are quite different. The notation and
results of the first paper are assumed. All matrices in this note are square of order $p$.


THE JACOBIANS OF MATRIX TRANSFORMATIONS 

THEOREM 1. The J of the transformation

$$Y = XA\quad\text{is}\quad D(X; Y) = \prod_{i=1}^{p}a_{ii}^{p-i+1}.$$

Proof. $y_{ij} = \sum_{k=j}^{i}x_{ik}a_{kj}$ $(i \ge j)$. The scheme of coefficients is triangular with diagonal elements
$a_{kk}$ $(p-k+1)$ times from
$$\partial y_{ik}/\partial x_{ik}\qquad (i = k,\dots,p;\ k = 1,2,\dots,p).$$

Hence the determinant is $\prod_{i=1}^{p}a_{ii}^{p-i+1}$.
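
The counting argument in this proof can be checked on a small case. The sketch below is an illustration under stated assumptions: $X$ and $A$ are taken to be lower triangular (as the limits $k = j,\dots,i$ in $y_{ij}$ suggest), $D(X; Y)$ is read as the determinant of the scheme of coefficients $\partial y/\partial x$, and the helper names are hypothetical.

```python
def lower_indices(p):
    """Positions (i, j) with i >= j of a p x p lower-triangular matrix."""
    return [(i, j) for i in range(p) for j in range(i + 1)]


def coefficient_scheme(A):
    """Matrix of d y_ij / d x_ak for Y = X A with X, A lower triangular."""
    p = len(A)
    idx = lower_indices(p)
    J = [[0.0] * len(idx) for _ in idx]
    for row, (i, j) in enumerate(idx):
        for col, (a, k) in enumerate(idx):
            # y_ij = sum_{k=j}^{i} x_ik a_kj, so only x_ik with k >= j enters.
            if a == i and k >= j:
                J[row][col] = A[k][j]
    return J


def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n = len(M)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            return 0.0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d


def theorem1_value(A):
    """Product of a_ii^(p - i + 1), i = 1, ..., p."""
    p = len(A)
    value = 1.0
    for i in range(p):
        value *= A[i][i] ** (p - i)   # zero-based i: exponent p - (i+1) + 1
    return value


if __name__ == "__main__":
    A = [[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [4.0, 5.0, 0.5]]
    print(determinant(coefficient_scheme(A)), theorem1_value(A))
```

With the arbitrary $3\times3$ matrix used here both quantities equal $2^3\cdot3^2\cdot0.5 = 36$.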


THEOREM 2. The J of the transformation 
Y=R'4+A'X is D(X; Y) = 2 [Hah 
i 
Proof. Pre- and post-multiply by A’-! and 4-1, respectively, and let Z = A’ YA-, 
0 = XA-. Then Z = 0 +0’ and 
D(X; Y) = D(X; 0) D(O; Z) D(Z; Y). 


We can now evaluate each component 
i a Pp “ 
D(X; 0) = Tl age-*», D(Z; Y) =|A |e! = J] aR*1, DO; Z) = 2. 
1 1 
CoroLuary 2. The J of the transformation 
- aie — ae 
V=T7T is DT; V) = 2° JT t,. 
1 


Proof. Taking differentials of V = 7’7' and making the correspondence (dV) = Y, 
(a7) = X, T = A, of Theorem 2, the corollary follows. 


* The work reported here was completed at the University of North Carolina and was supported 
in part by the Office of Naval Research. 













THEOREM 3. The J of the transformation V = D, RD,, where r;; = 1, is 
D(x, R; V) = 2 [1 2?. 
1 


Proof. v;; = x3, 04; = X;%;7;;(t+j). Hence the scheme of coefficients is 


x; ‘ij 


v, | 2D, 90 
Vij | L D 
where D,:pxp and D,: p(p—1)/2xp(p—1)/2 are diagonal matrices with elements 
2a, ..-, 2, and 2, %y,%1Xp, ..., Zp Xp, respectively. The matrix L does not affect the value 
of the determinant, which is equal to 
Pp Pp 
2° Tx, I] 2,2; = 2” TT 2?. 
1 i<j i 


i 
THEOREM 4. The J of the transformation 7 = D,O, where > u3; = 1(i = 1,...,p) is 
j=1 


D(x,0; T) = il afta). 
1 





1-1 t 
Proof. t,; = UNG = x,(1 — 7 us, ’ ti; = Ui (a >J). 
Hence the scheme of coefficients is 
Gi Uy 
t;|D, —-H#, 
ty; | Ey D, 





where D,:pxp and D,: p(p—1)/2x p(p—1)/2 are diagonal matrices with elements 
Uy1)+++)Upp and x, of multiplicity (i—1),(i = 1,...,p), respectively. The latter arise from 
Ot;,;/Ou,; = x,(j = 1,2,...,4-1). Hy: pxp(p—1)/2 has elements 2,u,,/u,, arising from 
Ot;,/Ou,;(j = 1,2,...,4-—1; ¢ = 1,...,) and zeros elsewhere. E, : p(p—1)/2 x p has elements 
u,; arising from dt,,/dx,;(j = 1,2,...,i-1;i=1,...,p) and zeros elsewhere. The deter- 


minant of partial derivatives is 
i-1 
oa x “) 
ar. ee 
Un 


gE i 
THEOREM 5. The J of the transformation R = 00’, wherer,, = 1, Du; = 1,(i = 1,..., p) is 
j=1 








p 
| D,| | D,+ #, Dy*E,| = | a 


D(O; R) = wee. 


i 
Proof. ry, = 1, 743 = LY Upy Uy; (i <j). The scheme of coefficients is triangular with diagonal 
k=1 
elements u;, of multiplicity (i—1) arising from @r,,/Ou,;(j = 1,...,i—1). Hence the deter- 


Pp 
minant of partial derivatives is [] u?;-*. 
1 








THEOREM 6. The J of the transformation $Y = (A+X)^{-1}(A-X)$, where $A$ is a matrix of
scalars, is
$$D(X; Y) = 2^{p^2}|A|^{p}\,|A+X|^{-2p}.$$
Proof. Taking differentials of $Y = (A+X)^{-1}(A-X)$, we get
$$(dY) = -2(A+X)^{-1}(dX)(A+X)^{-1}A.$$
Using Theorem 3-6 of the earlier paper,
$$D(X; Y) = |2(A+X)^{-1}|^{p}\,|(A+X)^{-1}A|^{p}.$$
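
A finite-difference check of this Jacobian for $p = 2$ is sketched below. It is an illustration only: it assumes $D(X; Y)$ denotes the absolute determinant of $\partial Y/\partial X$ for a general (unrestricted) matrix $X$ and that the stated value is $2^{p^2}|A|^{p}|A+X|^{-2p}$; the matrices and helper names are arbitrary, hypothetical choices.

```python
def mat_mul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]


def cayley(A, X):
    """Y = (A + X)^(-1) (A - X) for 2 x 2 matrices."""
    plus = [[A[i][j] + X[i][j] for j in range(2)] for i in range(2)]
    minus = [[A[i][j] - X[i][j] for j in range(2)] for i in range(2)]
    return mat_mul(inv2(plus), minus)


def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n = len(M)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            return 0.0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d


def numerical_jacobian(A, X, h=1e-6):
    """4 x 4 matrix of d y_ij / d x_ab by central differences."""
    entries = [(0, 0), (0, 1), (1, 0), (1, 1)]
    columns = []
    for (a, b) in entries:
        Xp = [row[:] for row in X]
        Xm = [row[:] for row in X]
        Xp[a][b] += h
        Xm[a][b] -= h
        Yp, Ym = cayley(A, Xp), cayley(A, Xm)
        columns.append([(Yp[i][j] - Ym[i][j]) / (2.0 * h) for (i, j) in entries])
    return [[columns[c][r] for c in range(4)] for r in range(4)]


def theorem6_value(A, X):
    p = 2
    plus = [[A[i][j] + X[i][j] for j in range(2)] for i in range(2)]
    return 2.0 ** (p * p) * abs(det2(A)) ** p * abs(det2(plus)) ** (-2 * p)


if __name__ == "__main__":
    A = [[2.0, 0.5], [0.3, 1.5]]
    X = [[0.2, -0.1], [0.4, 0.3]]
    print(abs(determinant(numerical_jacobian(A, X))), theorem6_value(A, X))
```

The two printed numbers should agree up to the finite-difference error, which is consistent with the differential $(dY) = -2(A+X)^{-1}(dX)(A+X)^{-1}A$.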
THEOREM 7. The J of the transformation
$$Y = (I+X)^{-1}(I-X)\quad\text{is}\quad D(X; Y) = 2^{p(p+1)/2}\,|I+X|^{-(p+1)}.$$
Proof. Taking differentials of $Y = (I+X)^{-1}(I-X)$, we get
$$(dY) = -2(I+X)^{-1}(dX)(I+X)^{-1}.$$
Using Theorem 3-7 of the earlier paper,
$$D(X; Y) = |2(I+X)^{-2}|^{(p+1)/2}.$$


THEOREM 8. The J of the transformation Y = (4 + X)-1(4 —X), where J is a matrix of 
scalars +0, is : 
D(X; Y) = 2-@+ne | A +X |-@+» J] ag -*+1, 
1 


Proof. Taking differentials of Y = (4 + X)-1(4 —X), we get 
(d¥) = —2(44+ X)2 (dX) (44+ 2X). 
Using Theorem 3:8 of the earlier paper and Theorem 1, 


D(X; Y) = i QE (agg + 45)" (gg +1 4q)- PH ae 3, 
THEOREM 9. The J of the transformation Y = 7'D,7’, where ¢,; = 1, is 
D(t,xz; Y)= Tap 
Proof. Taking differentials of Y = 7'D,7’ and pre- and post-multiplying by 7-1 and 


T’-1, respectively, a 
T-dY) 7’ = T-(dT) D, + (dD,) + D(dT)'T’—. 


Let Z=T-dY)T’, D,=dD,, 0 = TdT), 
where u;; = 0, then Z = OD,+D,.+D,0' 
and D(T x; Y) = D(dT, dx; dY) = D(dT,dx; 0, dw) D(O,, dw; Z). 


24; = w,, and z;; = u,;z; and hence the scheme of coefficients for D(O, dw ; Z) is 


Wy ty 


a, | 1 0 
z,|L D 













where D: p(p—1)/2x p(p—1)/2 is diagonal with elements x; of multiplicity (»-—i) 
(i = 1,...,p). As the matrix L does not affect the determinant, we obtain 


7m p 
D(O, dw; Z) = | D| = TI 2?-}, 
1 
D(a?, dx; O,dw) = | P |-0 Ty tz = 1. 
1 


THEOREM 10. The J of the transformation Y = ['7'T’, where I is orthogonal such that 


|T+I| +0, is D(X; T; Y) = 2@-w2 | 74 X|---» TT (t,,-t,,), 
i<j 

where l= (+X) (1-X). 

Proof. Taking differentials of Y = '7T’, and pre- and post-multiplying by I’ and I, 
respectively, (dy) = l(a) 1+ (dT) + Mary’. 
Let W=Irdy)r, S=F(dr), 0 =d7, 
then &: . 

(1) W =S8T+0-TS, 


(Note. I’(dT) = Sis skew-symmetric, hence Sy 
D(X, 7; Y) = D(dX, dT; dY) = D(dX, aT; 8,0) D(S,0; W)(W; dY)=D,D,D,, 
D, = 2h~-v2 | J +X |-e-», D,=1, 
D, arises from (1), and the scheme of coefficients is 
ea Mag Ft 

Yi rout 

yj(i<j)|0 I M, 

y(i<j)|9 0 N 
the determinant of which is | N |. N is a triangular matrix with diagonal elements ¢;; —t;;) 
(I +)) arising from dy,,/0s;; = t;;—t,;, where 





Pp 1 
Yiz = DX Sphinn — Li Seg tes: 
k=j k=1 


Hence | V| = II (ts —tys)- 
i<j 
CoNCLUSION 

In this note, the Jacobians of certain matrix transformations have been evaluated. 
Theorem 2 is concerned with an alternative form of the Toeplitz factorization which is 
a rotation of the rectangular co-ordinates. In Theorem 3, the relation between the correla- 
tion and covariance matrices is stated. In Theorem 4, a set of normalized rectangular 
co-ordinates is introduced, and in Theorem 5, these are related to the correlation matrix. 
In Theorems 6-8, respectively, the Cayley transformation for a general matrix, a symmetric 
and a triangular matrix is given. In Theorem 10, the Smith canonical form which yields 
the characteristic roots of a matrix is considered. 


REFERENCES 


Deemer, W. L. & Olkin, I. (1951). The Jacobians of certain matrix transformations useful in
multivariate analysis, based on Lectures by P. L. Hsu. Biometrika, 38, 345.










[ 47 ] 


ESTIMATION OF A FUNCTIONAL RELATIONSHIP* 


By D. V. LINDLEY 
Statistical Laboratory, University of Cambridge 


It is known (Lindley, 1947, § 7-2) that if there is a linear functional relationship between 
two quantities, if the only observations available on these quantities are subject to error 
and if all the distributions are normal, then there is no satisfactory estimate of the functional 
relationship unless the relative accuracy of measurement of the two quantities, or some 
similar knowledge, is available. Furthermore, in this situation there are three relationships 
which exist between the two quantities (either with or without error), namely, the original 
functional relationship and two regression equations which, in this case, will be linear. The 
experimenter who encounters this situation is therefore faced with a difficulty and a com- 
plexity in the mathematics which he feels is foreign to his practical situation, and it is not 
therefore surprising that numerous attempts have been made to remove the undesirable 
features of the problem. The fact that the experimenter finds difficulties which he cannot 
readily interpret suggests that the mathematical model is unsatisfactory. A new model has 
therefore been suggested by Berkson (1950), and with this model he claims that the estima- 
tion is possible, that there is only one regression and that this regression has the same con- 
stants as the functional relationship. Kendall (1952) has raised some doubts about Berkson’s 
procedure, and when I read Prof. Kendall’s paper before publication I thought I agreed 
with him. Further consideration has revealed that Berkson’s procedure for estimating the 
functional relationship is sound and his mathematical model, incorporating the idea of 
a controlled observation, is likely to be of very wide applicability. In the present paper 
I give a mathematical justification of Berkson’s procedure which may help to clarify the 
issue. 

As a compromise between the notations of Berkson and Kendall random variables 
(variates in Kendall’s terminology) will be denoted by lower case Roman letters and other 
quantities (mathematical variables) by capital letters. Any random variable or quantity 
which can be observed will have a prime attached to it. We now postulate the existence 
of two quantities, U and V, which are connected by a linear functional relationship 


$$V = \alpha+\beta U.\qquad (1)$$


Nothing is said about how U and V are to be obtained; they may be obtained by a random 
process so that (1) may equally be written 


$$v = \alpha + \beta u, \tag{2}$$


but, in view of the relationship, if one is a random variable so is the other. In the problem 
under consideration U and V are not observable; observations are only available on 


$$x' = u + d, \tag{3}$$
$$y' = v + e; \tag{4}$$


* See editorial note on p. 49 below. 










where d and e are two random variables. The problem is to estimate $\alpha$ and $\beta$ from several
pairs of observations on $x'$ and $y'$. If we now assume


(i) that u (and hence v) is normally distributed,
(ii) that d and e are independently normally distributed with zero means, and
(iii) that d and e are independent of u,


then estimation of $\alpha$ and $\beta$ is not possible, and the regression lines of $y'$ on $x'$ and of $x'$ on $y'$ will
not be coincident and neither will have the constants of the functional relationship. The
estimation does not seem possible even if (i) is omitted and U is not a random variable.
This is the difficulty referred to above.

Berkson queries the formulation (3) and the assumption (iii). Taken together these imply 
that we have a true value u (or U) which we are unable to measure directly but in making 
a measurement on u we introduce a random error d which is unbiased and independent of u 
so that x’ is of necessity a random variable. Berkson calls x’ an uncontrolled observation. 
He suggests that observations are not always uncontrolled; often we bring a measurement 
to a value determined beforehand (whether or not by a random process will be seen not to 
matter) so that our measured quantity is fixed but the actual value of the quantity used in 
the experiment will differ by an error d from the measurement. Hence mathematically if 
we take the case where the measurement is not a random variable 


$$u = X' - d, \tag{5}$$


where we may assume d to be normally distributed with zero mean, and the minus sign is 
introduced in order to show the analogy between (3) and (6). Rewriting (5) 


$$X' = u + d, \tag{6}$$


and u and d are no longer independent. $X'$ is called a controlled observation. Berkson suggests
that in many experiments one value is controlled and the other is uncontrolled; for example
we may bring the current to 2 amp. (controlled) and measure the voltage as 3.67 V. (uncontrolled). In this case Berkson argues that estimation of $\alpha$ and $\beta$ is possible and the
regression of $y'$ on $X'$ is linear with the same constants. The proof of these statements is
simple. From (2), (4) and (5)
$$y' - e = \alpha + \beta(X' - d),$$
i.e.
$$y' = \alpha + \beta X' + (e - \beta d) = \alpha + \beta X' + f, \tag{7}$$


where $f = e - \beta d$ is a normally distributed random variable with zero mean. This is the
ordinary regression of a random variable $y'$ on a fixed variable $X'$, and accordingly the regression stated is established and the usual estimates
$$b = \frac{\sum (y' - \bar{y}')(X' - \bar{X}')}{\sum (X' - \bar{X}')^2}, \qquad a = \bar{y}' - b\bar{X}',$$
of $\beta$ and $\alpha$ are available and have the usual optimum properties. Furthermore, even if $X'$
is a random variable (which in most applications it will not be) Berkson’s argument persists
because f is independent of $x'$. In the original model where we have (2), (3) and (4) we obtain
the equation
$$y' = \alpha + \beta x' + f, \tag{8}$$


analogous to (7) but x’ and f are correlated, since x’ is correlated with d, and thus the usual 
regression analysis ceases to apply. A finer analysis shows the estimation to be impossible. 
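
The contrast between the controlled model (5) and the uncontrolled model (3) can be checked numerically. What follows is a minimal simulation sketch and not part of the original argument: the parameter values, the preset design points for $X'$, the sample size and the function names `least_squares` and `simulate` are all illustrative assumptions.

```python
import random

def least_squares(xs, ys):
    """Return (a, b) for the usual least-squares line y = a + b x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

def simulate(n=20000, alpha=1.0, beta=2.0, sd_d=1.0, sd_e=0.5, seed=0):
    """Illustrative values only; none of them come from the paper."""
    rng = random.Random(seed)

    # (a) Controlled observation: X' is set beforehand, true value u = X' - d.
    X = [float(i % 5) for i in range(n)]                  # preset points 0, 1, ..., 4
    y_ctrl = [alpha + beta * (x - rng.gauss(0, sd_d)) + rng.gauss(0, sd_e) for x in X]

    # (b) Uncontrolled observation: u is random and x' = u + d is what is recorded.
    u = [rng.gauss(2.0, 2.0 ** 0.5) for _ in range(n)]    # variance 2, as for the design
    x_unc = [ui + rng.gauss(0, sd_d) for ui in u]
    y_unc = [alpha + beta * ui + rng.gauss(0, sd_e) for ui in u]

    return least_squares(X, y_ctrl), least_squares(x_unc, y_unc)

(a_c, b_c), (a_u, b_u) = simulate()
print(f"controlled:   a = {a_c:.3f}, b = {b_c:.3f}")
print(f"uncontrolled: a = {a_u:.3f}, b = {b_u:.3f}")
```

Under these assumptions the controlled fit should give b close to $\beta = 2$, while the uncontrolled fit is drawn towards $\beta\,\mathrm{var}(u)/\{\mathrm{var}(u) + \mathrm{var}(d)\} = 4/3$, which reflects the correlation between $x'$ and f in the original model.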













Mathematically the essence of Berkson’s idea is to suppose d independent of x’ (where I
mean statistical independence in the case where x’ is a random variable) in contrast with the
usual assumption that d is independent of u. The above analysis shows that in Berkson’s
situation the problem is equivalent to that of the regression of y’ on X’ (or x’). If X’ is not
a random variable then Berkson’s procedure can be justified without the assumption of
normality by the familiar least squares argument and an appeal to Gauss’s theorem.
Since it has caused confusion in the past it might be helpful to rephrase Berkson’s first
conclusion (p. 173). If x’ is controlled and y’ is not, then the regression of y’ on x’ is of the
usual form, has the same constants as the functional relationship, and the latter may be
found by the usual regression methods. If y’ is controlled and x’ is not, the regression of x’
on y’ has similar properties. But note that in the former case there will still be a regression
of x’ on y’ though it will not have the constants of the functional relationship; it may not
even be linear. So that Berkson’s title ‘Are there two regressions?’ is still answered in the
affirmative if the controlled observation is a random variable (a case not considered by
him), but in the negative when it is a mathematical variable, since regression of a
mathematical variable has no meaning. Nevertheless, his procedure for estimating the
functional relationship, which is quite a different thing, as Kendall points out, is valid.


REFERENCES 


Berkson, J. (1950). J. Amer. Statist. Ass. 45, 164.
Kendall, M. G. (1952). Biometrika, 39, 96.
Lindley, D. V. (1947). Suppl. J.R. Statist. Soc. 9, 218.


[Editorial Note. In his second paper on ‘Regression, structure and functional relation- 
ship’, Kendall (1952) criticized certain of Berkson’s (1950) conclusions regarding regression 
problems when dealing with controlled experiments. To these criticisms Dr Berkson sub- 
mitted a reply which in turn led to more comments from Prof. Kendall. In considering 
these further contributions as Editor, I recalled how often in controversy disagreement 
arises from quite inadvertent misunderstanding of the other man’s terminology, notation or 
method of approach; to clear the position an independent re-statement of the problems at 
issue may be needed. The preceding note by Mr D. V. Lindley appeared to be so helpful in 
this connexion that I asked the two contestants whether they would agree to its publication 
in place of their own explanations. This plan they both welcomed; it is, however, right that 
I should add that, while Dr Berkson finds himself in substantial agreement with what 
Mr Lindley has written, Prof. Kendall is not convinced that the points at issue have all 
been cleared. E.S.P.] 




ESTIMATING PARAMETERS IN TRUNCATED PEARSON 
FREQUENCY DISTRIBUTIONS WITHOUT RESORT TO 
HIGHER MOMENTS 


By A. C. COHEN, JR., The University of Georgia 


1. INTRODUCTION 


Truncated samples arise whenever observed measurements are restricted to an interval 
which fails to include the full range of population values. Numerous biological, psycho- 
logical, sociological, economic, engineering and other scientific investigations produce such 
samples. They result from studies of ‘time to death’ or ‘time to react’ of toxicants, stimulants, 
etc., when observation is discontinued before all exposed specimens exhibit the reaction 
under study. They likewise arise when sorting or inspection procedures eliminate specimens 
above or below designated tolerance limits, as well as in many other practical situations. 
Until recently, interest in truncated samples has been largely limited to those from normal 
populations. The present paper is more general in scope and is concerned with truncated 
samples from all types of Pearson populations. 

Without resorting to higher moments as was done in an earlier work by the writer (1951), 
the present paper treats the specific problem of estimating parameters of univariate Pearson 
populations from samples truncated at known terminal points when the number of observa- 
tions thus eliminated is unknown. Using the method of moments, estimating equations are 
obtained which involve only the first four sample moments in contrast to the first six which 
were previously employed. The Pearson-Lee-Fisher estimating equations (1908, 1915, 1931) 
for singly truncated samples from normal populations follow as a special case of the present 
results. Likewise, previous results of the writer (1950a) pertaining to singly truncated type 
III distributions also constitute a special case. The practical application of the results of 
this paper is illustrated using a doubly truncated sample from a type III population. 
Throughout this paper, samples are described as singly or doubly truncated according to 
whether one or both ‘tails’ are missing. 


2. SOME FUNDAMENTAL MOMENT RELATIONSHIPS 


In the writer’s paper (1951), previously referenced, it was demonstrated that the system of 
moment equations 
$$h\mu'_k + b_0 k\mu'_{k-1} + b_1 k\mu'_k + b_2(k+2)\mu'_{k+1} = \mu'_{k+1} \quad (k = 0, 1, 2, 3), \tag{1}$$
where $h = a + b_1$, and $\mu'_k$ is the kth moment of the general Pearson frequency function,
follows readily from the differential equation
$$\frac{1}{f(x)}\frac{df(x)}{dx} = \frac{a - x}{b_0 + b_1 x + b_2 x^2}. \tag{2}$$
Equations (1) differ from the usual equations for the Pearson system only in the use of
h in place of a. They were, of course, derived from (2) on the assumption that the expression
$(b_0 + b_1 x + b_2 x^2)\,x^k f(x)$ vanishes at both terminals of f(x). For the Pearson curves of types
IV, VI and VII having a range unlimited in one or both directions, this will not be true when
k exceeds a certain value which depends on $b_2$.
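
As a numerical illustration of the moment equations (1), taken here in the form $h\mu'_k + b_0 k\mu'_{k-1} + b_1 k\mu'_k + b_2(k+2)\mu'_{k+1} = \mu'_{k+1}$ with $h = a + b_1$, consider a type III (gamma) curve $f(x) \propto x^{\kappa-1}e^{-x/s}$ measured from its lower limit. Differentiating gives $a = (\kappa - 1)s$, $b_0 = 0$, $b_1 = s$, $b_2 = 0$. The sketch below checks the four equations for one arbitrary choice of $\kappa$ and s; the function names and the chosen values are assumptions made for the illustration only.

```python
import math

def gamma_raw_moment(k, shape, scale):
    """k-th moment about the origin of a gamma density with the given shape and scale."""
    return scale ** k * math.exp(math.lgamma(shape + k) - math.lgamma(shape))

def recurrence_residual(k, shape, scale):
    """Left side minus right side of the moment equation (1) for the gamma case."""
    a, b0, b1, b2 = (shape - 1.0) * scale, 0.0, scale, 0.0
    h = a + b1

    def mu(j):
        # mu'_{-1} only ever appears multiplied by k = 0, so any finite value will do.
        return gamma_raw_moment(j, shape, scale) if j >= 0 else 0.0

    lhs = h * mu(k) + b0 * k * mu(k - 1) + b1 * k * mu(k) + b2 * (k + 2) * mu(k + 1)
    return lhs - mu(k + 1)

residuals = [recurrence_residual(k, shape=16.0, scale=0.5) for k in range(4)]
print(residuals)   # each entry should be zero up to rounding error
```

In this special case (1) reduces to the familiar relation $\mu'_{k+1} = s(\kappa + k)\mu'_k$.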















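If the origin from which x is measured is designated $\xi$ in standard units referred to the
distribution mean, then
$$\mu'_1 = -\sigma\xi, \tag{3}$$
where $\sigma$ is the standard deviation.
If now the non-central moments of (1) are replaced by standard moments $\alpha_k$,† where
$$\alpha_k = \sum_{i=0}^{k} \binom{k}{i} \mu'_i\,(-\mu'_1)^{k-i} / \sigma^k, \tag{4}$$
the resulting system can be solved to yield
$$h = \sigma\lambda, \quad b_0 = \sigma^2\theta, \quad b_1 = \sigma\psi, \quad b_2 = \frac{6 + 3\alpha_3^2 - 2\alpha_4}{2(9 + 6\alpha_3^2 - 5\alpha_4)}, \tag{5}$$
where
$$\lambda = -\xi(1 - 2b_2),$$
$$\theta = [(1 - 3b_2) + \xi\{(1 - 4b_2)\alpha_3/2 + \xi b_2\}], \tag{6}$$
$$\psi = [\alpha_3/2 - 2b_2(\alpha_3 - \xi)].$$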
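
The relations (5) and (6) are easily mechanized. The sketch below assumes they read $h = \sigma\lambda$, $b_0 = \sigma^2\theta$, $b_1 = \sigma\psi$, $b_2 = (6 + 3\alpha_3^2 - 2\alpha_4)/\{2(9 + 6\alpha_3^2 - 5\alpha_4)\}$, with $\lambda = -\xi(1 - 2b_2)$, $\theta = (1 - 3b_2) + \xi\{(1 - 4b_2)\alpha_3/2 + \xi b_2\}$ and $\psi = \alpha_3/2 - 2b_2(\alpha_3 - \xi)$. The function name and the numerical inputs are illustrative assumptions, not values prescribed by the paper.

```python
def pearson_coefficients(xi, sigma, alpha3, alpha4):
    """h, b0, b1, b2 of the Pearson equation from xi, sigma and the standard moments."""
    b2 = (6 + 3 * alpha3 ** 2 - 2 * alpha4) / (2 * (9 + 6 * alpha3 ** 2 - 5 * alpha4))
    lam = -xi * (1 - 2 * b2)
    theta = (1 - 3 * b2) + xi * ((1 - 4 * b2) * alpha3 / 2 + xi * b2)
    psi = alpha3 / 2 - 2 * b2 * (alpha3 - xi)
    return {"h": sigma * lam, "b0": sigma ** 2 * theta, "b1": sigma * psi, "b2": b2}

# Normal curve: alpha3 = 0, alpha4 = 3, origin taken at the mean (xi = 0).
normal = pearson_coefficients(xi=0.0, sigma=2.0, alpha3=0.0, alpha4=3.0)

# Type III curve: alpha4 = 3 + 1.5 * alpha3**2, for which b2 vanishes.
a3 = 0.5
type3 = pearson_coefficients(xi=-2.44, sigma=15.5, alpha3=a3, alpha4=3 + 1.5 * a3 ** 2)

print(normal)
print(type3)
```

For the normal curve the sketch returns $b_1 = b_2 = 0$ and $b_0 = \sigma^2$; for the type III curve it returns $b_2 = 0$ and values satisfying $b_0 + b_1 h = \sigma^2$.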


3. GENERAL FOUR-PARAMETER DISTRIBUTIONS 
Doubly-truncated samples. If f(x) is truncated both on the left and on the right, and if the 
origin is taken at the left terminus so that 0 <x <d, where d is thus the truncated range, the 
following system of equations may be obtained in a manner similar to that by which (1) 
was derived: 
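$$h + 2m_1 b_2 + F_1 - F_2 = m_1,$$
$$hm_k + b_0 k m_{k-1} + b_1 k m_k + b_2(k+2)m_{k+1} - d^k F_2 = m_{k+1} \quad (k = 1, 2, 3), \tag{7}$$
where
$$F_1 = f(0)\,b_0, \quad F_2 = f(d)\,[b_0 + db_1 + d^2 b_2], \tag{8}$$
and $m_k$ is the kth moment of the truncated distribution about its left terminus, i.e.
$$m_k = \int_0^d x^k f(x)\,dx \quad \text{with } m_0 = 1. \tag{9}$$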


It should be noted that truncation does not alter the differential equation. The constants
b in (7) are therefore identical with those in (2), and may be expressed through (5) and (6)
in terms of moments of the complete distribution. Further details are included in the writer’s
earlier paper.

The left and right truncation points in standard units of the complete distribution
referred to its mean are designated as $\xi'$ and $\xi''$, respectively, with $\xi'' = \xi' + d/\sigma$. We can write
$$F_1 = \sigma Z_1 \quad \text{and} \quad F_2 = \sigma Z_2, \tag{10}$$
where
$$Z_1 = [\phi(\xi')/p]\,[(1 - 3b_2) + \xi'\{(1 - 4b_2)\alpha_3/2 + \xi' b_2\}],$$
$$Z_2 = [\phi(\xi'')/p]\,\{[(1 - 3b_2) + \xi'\{(1 - 4b_2)\alpha_3/2 + \xi' b_2\}] + (d/\sigma)[\alpha_3/2 - 2b_2(\alpha_3 - \xi')] + (d/\sigma)^2 b_2\}. \tag{11}$$

† In the original Pearson notation, $\alpha_3 = \sqrt{\beta_1}$, $\alpha_4 = \beta_2$.




The foregoing results follow from (8) when we substitute the values given in (5) and (6) for
$b_0$ and $b_1$ and simultaneously substitute for f(0) and f(d) the ratio of standardized ordinate
to area
$$f(0) = \phi(\xi')/(p\sigma) \quad \text{and} \quad f(d) = \phi(\xi'')/(p\sigma), \quad \text{where } p = \int_{\xi'}^{\xi''} \phi(t)\,dt. \tag{12}$$

If the values given in (10) and (11) for $F_1$ and $F_2$ and those in (5) for h, $b_0$ and $b_1$ are substituted into (7), the resulting system of equations becomes
$$m_1 = \sigma[Z_1 - Z_2 + \lambda]/(1 - 2b_2) = \sigma g_1,$$
$$m_2 = \sigma^2[(\lambda + \psi)g_1 + \theta - (d/\sigma)Z_2]/(1 - 3b_2) = \sigma^2 g_2,$$
$$m_3 = \sigma^3[(\lambda + 2\psi)g_2 + 2\theta g_1 - (d/\sigma)^2 Z_2]/(1 - 4b_2) = \sigma^3 g_3, \tag{13}$$
$$m_4 = \sigma^4[(\lambda + 3\psi)g_3 + 3\theta g_2 - (d/\sigma)^3 Z_2]/(1 - 5b_2) = \sigma^4 g_4,$$
where $g_1$, $g_2$, $g_3$ and $g_4$ are functions of $\xi'$, $\sigma$, $\alpha_3$ and $\alpha_4$ as defined above. It must be remembered
that $\theta$, $\lambda$ and $\psi$ as defined in (6) are functions of $\xi'$, $\alpha_3$ and $\alpha_4$ but not of $\sigma$.


When observed sample moments are equated to corresponding population moments,
i.e. $\nu_k = m_k$, where $\nu_k$ is the kth observed moment of the truncated sample about its left
terminus $\left(\nu_k = \sum_{i=1}^{n} x_i^k / n\right)$, and these values are substituted into (13), we obtain the system of
estimating equations
$$\nu_r = \sigma^{*r} g_r(\xi'^*, \sigma^*, \alpha_3^*, \alpha_4^*) \quad (r = 1, 2, 3, 4). \tag{14}$$


The stars (*) were added to distinguish estimates from parameters. After solving (14) for
estimates $\xi'^*$, $\sigma^*$, $\alpha_3^*$ and $\alpha_4^*$, the mean can be estimated with the aid of (3) as
$$\mu_1'^* = -\sigma^*\xi'^*. \tag{15}$$


In practice, solution of the complete system (14), which applies to general four-parameter
distributions when samples are doubly truncated, is likely to be rather tedious. These
equations can, however, be solved by an iterative process similar to one described by Scarborough (1930). First approximations might be obtained by using the writer’s method of
higher moments (1951) or perhaps even by judicious guessing. If the truncation is not too
severe, the usual moment estimates computed using complete sample formulae might
suffice.

Singly-truncated samples. When only a single ‘tail’ is missing from the sample, $g_1$, $g_2$, $g_3$ and
$g_4$ are functions of $\xi'$, $\alpha_3$ and $\alpha_4$ only. Accordingly, by eliminating $\sigma$ between them, the four
equations of (14) are in this instance reduced to three in number. The labour involved in
their solution is thereby greatly reduced.


4. TYPE III AND NORMAL DISTRIBUTIONS
Type III distributions. When f(x) is of type III, then $b_2 = 0$, and equations (5) and (6)
assume the simpler forms
$$h = -\sigma\xi, \quad \lambda = -\xi,$$
$$b_0 = \sigma^2(1 + \xi\alpha_3/2), \quad \theta = 1 + \xi\alpha_3/2, \tag{16}$$
$$b_1 = \sigma\alpha_3/2, \quad \psi = \alpha_3/2.$$







Subsequently, it follows from the above results together with equations (13) that for
doubly-truncated type III samples
$$g_1 = [Z_1 - Z_2 - \xi'],$$
$$g_2 = [(\alpha_3/2 - \xi')g_1 + (1 + \xi'\alpha_3/2) - (d/\sigma)Z_2], \tag{17}$$
$$g_3 = [(\alpha_3 - \xi')g_2 + 2(1 + \xi'\alpha_3/2)g_1 - (d/\sigma)^2 Z_2].$$
Similarly, it follows that
$$Z_1 = [\phi(\xi')/p]\,[1 + \xi'\alpha_3/2],$$
$$Z_2 = [\phi(\xi'')/p]\,[(1 + \xi'\alpha_3/2) + (d/\sigma)(\alpha_3/2)]. \tag{18}$$


Since only the three estimates $\xi'^*$, $\sigma^*$ and $\alpha_3^*$ are required in this case, it is necessary only
to let r = 1, 2 and 3 in (14) and then solve the three resulting equations.
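
The functions $g_1$, $g_2$ and $g_3$ for the type III case can be sketched as below, assuming they read $g_1 = Z_1 - Z_2 - \xi'$, $g_2 = (\alpha_3/2 - \xi')g_1 + (1 + \xi'\alpha_3/2) - (d/\sigma)Z_2$ and $g_3 = (\alpha_3 - \xi')g_2 + 2(1 + \xi'\alpha_3/2)g_1 - (d/\sigma)^2 Z_2$. The function name is hypothetical, and $Z_1$, $Z_2$ are taken as inputs (in practice they would come from tables of ordinates and areas). The limiting check with $Z_1 = Z_2 = 0$ is added here only as a sanity test and is not taken from the paper.

```python
def type3_g(xi1, alpha3, d_over_sigma, z1, z2):
    """g1, g2, g3 for a doubly truncated type III curve, following the form of (17)."""
    theta = 1 + xi1 * alpha3 / 2
    g1 = z1 - z2 - xi1
    g2 = (alpha3 / 2 - xi1) * g1 + theta - d_over_sigma * z2
    g3 = (alpha3 - xi1) * g2 + 2 * theta * g1 - d_over_sigma ** 2 * z2
    return g1, g2, g3

# With Z1 = Z2 = 0 (no truncation effect) the g's should be the first three moments,
# in units of sigma, of the complete curve about a point m = -xi' below its mean.
xi1, a3 = -2.0, 0.5
g1, g2, g3 = type3_g(xi1, a3, d_over_sigma=50.0, z1=0.0, z2=0.0)
m = -xi1
print(g1, g2, g3)
print(m, 1 + m ** 2, a3 + 3 * m + m ** 3)
```

The two printed rows should agree, since a standardized variate t with third moment $\alpha_3$ has $E(t + m) = m$, $E(t + m)^2 = 1 + m^2$ and $E(t + m)^3 = \alpha_3 + 3m + m^3$.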

With singly-truncated type III samples, the first three equations of (14) immediately
reduce to two when $\sigma$ is eliminated between them. This case has been considered separately
(1950a), and no further comment seems necessary here except to note that the function Z of 
the present paper was given as Z@ in the earlier work. 

Truncated normal distributions. When f(x) is the normal distribution, $b_1 = b_2 = 0$, and the
estimating equations of (14) reduce to identical forms obtained previously using the method
of maximum likelihood for both singly- and doubly-truncated samples (1949, 1950b).


5. A PRACTICAL APPLICATION 


The practical application of results of this paper can be conveniently demonstrated with
the same example previously employed to illustrate the method of higher moments. The
original data consist of weights of 1000 female students (cf. Table 1) as given by Miss Shook
(1930). Miss Shook computed estimates of population parameters from this sample under
the assumption that it was a complete sample from a type III population. For the present
illustration, a doubly-truncated sample was obtained from Miss Shook’s data by arbitrary
truncation at 79.95 and 159.95 lb. respectively. Thereby the first and the last six cells of the
grouped data were eliminated, and the truncated sample thus formed consists of 981
observations within the interval bounded by the above terminal points.

The first three sample moments about the left truncation point both before and after 
making appropriate corrections for grouping errors are given below: 


Uncorrected moments            Corrected moments
$n_1$ = 37.833843              $\nu_1$ = 37.833843
$n_2$ = 1660.06626             $\nu_2$ = 1651.73293
$n_3$ = 81,227.1916            $\nu_3$ = 80,281.3455


Grouping corrections which hold only when high contact conditions at the terminal points
are satisfied are not applicable. Those employed here are due to Craig (1937), which he
describes as Sheppard’s corrections for discrete data, derived under conditions which do
not demand high contact. In justification of their use in the present instance, note that
Miss Shook’s original raw data may be viewed as consisting of discrete measurements,
each recorded to the nearest 1/10 lb., with grouping accomplished by placing sequences of
100 consecutive values of these discrete values into successive 10 lb. intervals. When final











graduations are compared with observed data, the foregoing contentions seem to be 
further substantiated. 
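
The uncorrected moments quoted above can be recomputed from the grouped frequencies of Table 1. The sketch below assumes that each cell is represented by its mid-point (84.95, 94.95, ..., 154.95 lb.) and that moments are taken about the left terminus 79.95 lb. The grouping adjustments coded here are the familiar Sheppard forms $n_2 - w^2/12$ and $n_3 - (w^2/4)n_1$ for cell width w; that these are the forms actually used is an assumption, supported only by their numerical agreement with the corrected values quoted in the text.

```python
# Observed frequencies for the cells 80-89.9, ..., 150-159.9 lb. of Table 1.
frequencies = [16, 82, 231, 248, 196, 122, 63, 23]
width = 10.0        # cell width in lb.
terminus = 79.95    # left truncation point in lb.
midpoints = [84.95 + width * i for i in range(len(frequencies))]  # assumed cell mid-points

n = sum(frequencies)

def raw_moment(k):
    """k-th sample moment about the left terminus, computed from the grouped data."""
    return sum(f * (x - terminus) ** k for f, x in zip(frequencies, midpoints)) / n

n1, n2, n3 = (raw_moment(k) for k in (1, 2, 3))

# Sheppard-type grouping adjustments (assumed form; see the lead-in).
v1 = n1
v2 = n2 - width ** 2 / 12
v3 = n3 - width ** 2 / 4 * n1

print(n, n1, n2, n3)
print(v1, v2, v3)
```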

Since our sample is doubly truncated from a type III population, the required estimates
can be computed from the first three equations of (14). In the present instance, however,
it seems preferable as a matter of practical convenience to turn back to the first three
equations of (7) with $b_2 = 0$. From these we compute estimates of h, $b_0$ and $b_1$. Subsequently
we calculate estimates of $\xi'$, $\sigma$ and $\alpha_3$ using the relations
$$\sigma = \sqrt{b_0 + b_1 h}, \quad \alpha_3 = 2b_1/\sigma \quad \text{and} \quad \xi' = -h/\sigma,$$
which follow from equation (16).
On equating observed sample moments to corresponding population moments, estimating
equations for the truncated sample under consideration become
$$h^* = 37.833843 - F_1^* + F_2^*,$$
$$b_0^* + 37.833843\,b_1^* = 1651.73293 + 80F_2^* - 37.833843\,h^*,$$
$$75.667686\,b_0^* + 3303.46586\,b_1^* = 80281.3455 + 6400F_2^* - 1651.73293\,h^*.$$


Starting with first approximations $\xi'_1 = -2.5$, $\sigma_1 = 15$ and $(\alpha_3)_1 = 0.35$, which were
calculated as usual moment estimates, neglecting the fact that our sample is actually
truncated, we employ the iterative method described by Scarborough (1930) to improve
these results. After seven iterations,† we obtain $\xi'^* = -2.43$, $\sigma^* = 15.85$ and $\alpha_3^* = 0.61$, which
are correct to the number of significant digits given. Using these values, it follows from (15)
that $\mu_1'^* = 38.48$ and subsequently $M^* = 79.95 + \mu_1'^* = 118.43$ lb. (mean measured from 0).

Although convergence of the iterative process in this instance is not as rapid as might
be desired, the number of iterations required was increased somewhat as a consequence
of the rather wide error in the first approximation to $\alpha_3$ (0.35). When calculations were
repeated using as first approximations $(\alpha_3)_1 = 0.66$, $\sigma_1 = 16$ and $\xi'_1 = -2.42$, which were
obtained as higher moment estimates of population parameters, the iterative process led to
the same final estimates as before, after only two iterations instead of seven.

A graduation of the sample data based on the estimates obtained here was performed with 
the aid of Salvosa’s Tables (1930), and these results along with Miss Shook’s original gradua- 
tion based on estimates computed from the complete sample are included in Table 1. Also 
included in this table is a graduation taken from the writer’s earlier paper (1951) which was 
based on higher moment estimates from the same doubly-truncated sample considered here. 

It is observed that graduated frequencies based on estimates computed from the truncated 
sample are in closer agreement with observed frequencies than those based on complete 
sample moment estimates. This may be due largely to the fact that complete sample moment 
estimates are unduly affected by extreme observations which are eliminated from the 
truncated sample. At least it has been the writer’s experience to date that moment estimates 
of long-tailed Pearson distributions are improved by truncating the sample tails, and 
subsequently calculating estimates either by the method of this paper or by the higher 
moment method of the earlier paper. Although the graduation based on estimates computed 
from the first three truncated sample moments by the method of this paper does not appear 
to be appreciably better than the higher moment graduation based on the first five moments, 
the present estimates would be expected to exhibit smaller variances as a consequence of not 


† An illustration of one of these iterative steps is given in the final paragraph of this section.







using the higher order moments in their calculation. Either set of estimates, for the present 
illustration, would certainly be considered adequate as first approximations on which to 
base iterations to maximum-likelihood estimates. This topic is one which the writer hopes 
to investigate more fully in a subsequent paper. 


Table 1. Weights of 1000 female students

                                          Graduated frequencies
                             -----------------------------------------------------
                                                 Doubly truncated sample
                                              ------------------------------------
                                              Graduation based   Graduation based
                             Complete sample  on higher sample   on estimates of
Weight           Observed    (Miss Shook's    moments (first     this paper (first
(lb.)            frequency   graduation)      five moments)      three moments)
----------------------------------------------------------------------------------
70-79.9               2            0                0.2                0.2
80-89.9              16            4               12.7               13.8
90-99.9              82          102               94.0               93.2
100-109.9           231          238              214.0              212.1
110-119.9           248          250              253.9              255.0
120-129.9           196          184              200.8              202.9
130-139.9           122          111              120.6              121.2
140-149.9            63           59               59.4               58.5
150-159.9            23           29               25.2               24.1
160-169.9             5           13                9.5                8.7
170-179.9             7            6                3.3                2.9
180-189.9             1            3                1.0                0.9
190-199.9             2            1                0.3                0.2
200-209.9             1            0                0.1                0.1
210-219.9             1            0                0.0                0.0
----------------------------------------------------------------------------------
Total              1000         1000              995.0              993.6
Total frequency
in truncated
range               981          977              980.6              980.8
----------------------------------------------------------------------------------
M* (lb.)              -       118.74             118.56             118.43
σ* (lb.)              -      16.9175              16.02              15.85
α3*                   -     0.976424              0.657               0.61
Lower limit (lb.)     -        84.09              69.77              66.46

Truncated sample formed by truncating the complete sample on the left at 79.95 lb., and on the
right at 159.95 lb.


A sample of the calculations involved in one cycle of the iterative process will serve to
illustrate details of the technique. At the end of two cycles, we obtain as third approximations $(\alpha_3)_3 = 0.50$, $\sigma_3 = 15.5$ and $\xi'_3 = -2.44$. Since $\xi'' = \xi' + d/\sigma$, and since for the example










under consideration, d = 80, it follows that $\xi''_3 = -2.44 + 80/15.5 = 2.72$. From Salvosa’s
tables, we read
$$\phi(-2.44) = 0.005050, \quad \phi(2.72) = 0.017913 \quad \text{and} \quad p = \int_{-2.44}^{2.72} \phi(t)\,dt = 0.989868.$$


On substituting the above values in equations (18) we compute:
$$Z_1 = \frac{0.005050}{0.989868}\,[1 + (-2.44)(0.5)/2] = 0.00199,$$
and
$$Z_2 = \frac{0.017913}{0.989868}\,[1 + (-2.44)(0.5)/2 + (80/15.5)(0.5/2)] = 0.0304.$$


Finally, from equation (10) we have as third approximations,
$$F_1 = 15.5(0.00199) = 0.03 \quad \text{and} \quad F_2 = 15.5(0.0304) = 0.47.$$


These values are rounded off to 0 and 0.5 respectively, and the estimating equations for the
present cycle of calculations then become
$$h = 38.334,$$
$$b_0 + 37.834\,b_1 = 241.416,$$
$$b_0 + 43.658\,b_1 = 266.482.$$


On solving for $b_0$ and $b_1$ we obtain $b_1 = 4.304$ and $b_0 = 78.577$. Subsequently we calculate
fourth approximations
$$\sigma_4 = \sqrt{78.577 + (4.304)(38.334)} = 15.6,$$
$$(\alpha_3)_4 = 2(4.304)/15.6 = 0.55,$$
$$\xi'_4 = -38.334/15.6 = -2.46.$$


Using these values as new (fourth) approximations, we repeat the process as required until
the desired accuracy is achieved. As stated previously, seven iterations were necessary to
compute the final estimates given for the present example. In the subsequent iterations,
more decimals were carried and the rounding off of $F_1$ and $F_2$ was not quite so drastic as in the
cycle illustrated above.
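
The cycle just illustrated can be retraced by machine. The sketch below is an illustration under stated assumptions, not the writer's own computing scheme: in place of Salvosa's tables it evaluates the standardized type III ordinate directly from the gamma density with shape $4/\alpha_3^2$, and obtains p by Simpson's rule. The function names `phi` and `simpson` are introduced here for the purpose. As in the text, $\xi''$ is rounded to two decimals and $F_1$, $F_2$ are rounded to 0 and 0.5 before the linear equations are solved.

```python
import math

def phi(t, alpha3):
    """Standardized type III ordinate (zero mean, unit variance, skewness alpha3)."""
    k = 4.0 / alpha3 ** 2                 # gamma shape giving skewness alpha3
    g = k + math.sqrt(k) * t              # corresponding unit-scale gamma variate
    if g <= 0.0:
        return 0.0
    return math.sqrt(k) * math.exp((k - 1.0) * math.log(g) - g - math.lgamma(k))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

# Corrected sample moments and truncated range of the example.
v1, v2, v3, d = 37.833843, 1651.73293, 80281.3455, 80.0

# Third approximations quoted in the text.
alpha3, sigma, xi1 = 0.50, 15.5, -2.44
xi2 = round(xi1 + d / sigma, 2)           # rounded to two decimals as in the text

p = simpson(lambda t: phi(t, alpha3), xi1, xi2)
theta = 1.0 + xi1 * alpha3 / 2.0
Z1 = phi(xi1, alpha3) / p * theta
Z2 = phi(xi2, alpha3) / p * (theta + (d / sigma) * (alpha3 / 2.0))
F1, F2 = sigma * Z1, sigma * Z2

# The text rounds F1 and F2 drastically in this cycle; the same is done here.
F1r, F2r = 0.0, 0.5
h = v1 - F1r + F2r
# Two linear equations in b0 and b1 (k = 1, 2 of (7) with b2 = 0), solved by Cramer's rule.
r1 = v2 + d * F2r - v1 * h
r2 = v3 + d ** 2 * F2r - v2 * h
det = 2.0 * v2 - 2.0 * v1 ** 2
b0 = (2.0 * v2 * r1 - v1 * r2) / det
b1 = (r2 - 2.0 * v1 * r1) / det

sigma_new = math.sqrt(b0 + b1 * h)
alpha3_new = 2.0 * b1 / sigma_new
xi1_new = -h / sigma_new
print(p, Z1, Z2, F1, F2)
print(h, b0, b1, sigma_new, alpha3_new, xi1_new)
```

Repeating the cycle with the new values, and with less drastic rounding of $F_1$ and $F_2$, gives the method of iteration described in the text; the sketch deliberately stops after one cycle.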


6. CONCLUDING REMARKS 


Computations for applying the method of this paper to practical problems can be carried
out with the aid only of ordinary tables of areas and ordinates of appropriate Pearson curves.
However, for any routine applications it would be extremely helpful to have tables of $F_1$ and
$F_2$ and possibly also of $g_1$, $g_2$, $g_3$ and $g_4$ for the principal Pearson distributions.

Asymptotic variances of estimates given in this paper, although somewhat unwieldy in
form, are of the order of (1/n) and can be approximated by the so-called delta method as
described by Cramér (1946). Further investigation of the small sample variances of estimates
of this paper remains to be undertaken.







REFERENCES 


Cohen, A. C. Jr. (1949). On estimating the mean and standard deviation of truncated normal distributions. J. Amer. Statist. Ass. 44, 518.

Cohen, A. C. Jr. (1950a). Estimating parameters of Pearson type III populations from truncated
samples. J. Amer. Statist. Ass. 45, 411.

Cohen, A. C. Jr. (1950b). Estimating the mean and variance of normal populations from singly
truncated and doubly truncated samples. Ann. Math. Statist. 21, 557.

Cohen, A. C. Jr. (1951). Estimation of parameters in truncated Pearson frequency distributions.
Ann. Math. Statist. 22, 256.

Craig, C. C. (1937). Sheppard’s corrections for a discrete variable. Ann. Math. Statist. 7, 55.

Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.

Fisher, R. A. (1931). Properties and applications of Hh functions. Introduction to Mathematical
Tables, 1, xxvi-xxxv. Brit. Ass. Adv. Sci.

Lee, Alice (1915). Table of Gaussian ‘tail’ functions when the ‘tail’ is larger than the body. Biometrika,
10, 208.

Pearson, Karl & Lee, Alice (1908). On the generalized probable error in multiple normal correlation. Biometrika, 6, 59.

Salvosa, Luis R. (1930). Tables of Pearson’s type III function. Ann. Math. Statist. 1, appended.

Scarborough, J. B. (1930). Numerical Analysis, pp. 191-3. Johns Hopkins Press, Baltimore, Md.

Shook, B. L. (1930). Synopsis of elementary mathematical statistics. Ann. Math. Statist. 1, 14.










A PROBLEM OF INTERFERENCE BETWEEN TWO QUEUES 


By J. C. TANNER 
Road Research Laboratory, Department of Scientific and Industrial Research 


1. INTRODUCTION 


In most of the queueing situations which are amenable to mathematical analysis, the 
queueing units (customers in a shop, etc.) are all trying to achieve the same object, for 
example, in the case of a shop, to be served. Usually, the only differences between successive 
customers are their times of arrival, and perhaps the time they take to be served. Congestion 
then arises because only a limited number of customers can be served at a time. A feature 
of many road-traffic queueing situations, however, is that there are a number of different 
‘populations’ of ‘customers’ (in this case vehicles), and congestion arises not so much 
because too many vehicles want to do the same thing at the same time, but because a number 
of vehicles want to do different things which cannot all be done simultaneously. For 
example, at a cross-roads, less congestion is usually caused by a number of vehicles pro- 
ceeding in the same direction on the same road than by a smaller number of vehicles, half 
on one road and half on another. 

The problem to be considered here is one of the simpler ones of the above type. It concerns 
the delays that occur when two opposing streams of vehicles are trying to pass along a length 
of road only wide enough for one vehicle at a time. The mathematical difficulties arising 
from the necessity of considering the behaviour of two queues simultaneously are, not 
surprisingly, rather great, and the problem as first formulated has not been completely 
solved. Nevertheless, the results obtained have applications to several other road-traffic 
problems and these are briefly discussed. There may also be applications to other situations 
involving the flow of discrete objects, for example, articles produced in a factory. 


2. STATEMENT OF THE PROBLEM 


The physical situation associated with the statement and analysis of the problem has been 
chosen as one which is fairly easy to visualize. It is not necessarily the most useful applica- 
tion of the results. 

We have a road of which a certain length AB is only wide enough for a single traffic lane
and outside which there is free flow in both directions. Vehicles, referred to as $V_1$’s, arrive
at A at random at an average rate $q_1$ per unit time, wishing to pass from A to B. Similarly,
vehicles $V_2$ arrive at B at random at an average rate $q_2$ per unit time, wishing to pass from
B to A. A vehicle $V_1$ will pass straight into AB on arrival at A if there is no opposing vehicle
$V_2$ in AB, and if no other $V_1$ has entered AB during the previous time $\beta_1$. Otherwise it will wait
until both these conditions are satisfied. Similarly, a $V_2$ will pass into AB at B as soon as
there is no $V_1$ in AB and no other $V_2$ has entered AB within the previous time $\beta_2$. We further
assume that each $V_1$ takes exactly a time $\alpha_1$ to pass through AB, and that a $V_2$ takes exactly
$\alpha_2$. It is assumed that all vehicles travel at constant speed and that starting and stopping
times are negligible, though these can to some extent be allowed for by adjusting the
constants $\alpha_1$ and $\alpha_2$.
















The problem is to find the average delay $w_1$ to vehicles $V_1$ and $w_2$ to vehicles $V_2$ when a
stationary solution exists, i.e. when the rate at which vehicles can pass through AB is greater
than the rate at which, in the long run, they will arrive, so that queues of vehicles will always
clear themselves, given sufficient time. Other aspects of the process might be of interest,
for example, the frequency distribution of the waiting times, but the average delays are
considered to be more important and are easier to find.

There are three distinct cases which should be considered, according to the relative
magnitudes of the $\alpha$’s and $\beta$’s:


(i) $\alpha_1 > \beta_1$, $\alpha_2 > \beta_2$,
(ii) $\alpha_1 > \beta_1$, $\alpha_2 < \beta_2$ (or $\alpha_1 < \beta_1$, $\alpha_2 > \beta_2$),
(iii) $\alpha_1 < \beta_1$, $\alpha_2 < \beta_2$.


Cases (ii) and (iii) appear to have fewer applications and present mathematical difficulties
of rather greater complexity than case (i). We therefore confine attention to the latter case.

The physical situation associated with case (i) is as follows: If there is a $V_1$ in AB, then
$V_2$’s will not have an opportunity to enter AB until there are no $V_1$’s in AB or waiting at A,
since there will, until then, always be at least one $V_1$ in AB. Similar considerations apply
for the other stream, and we may say that at any moment either $V_1$’s or $V_2$’s or no vehicles
have control of AB, according to whether it is occupied by $V_1$’s, $V_2$’s or is empty. In cases (ii)
and (iii), a queue of vehicles may not be able to clear in a single ‘block’; vehicles in the other
stream may be able to interrupt the flow, even though the vehicles in the first are at their
minimum separation.
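
The entry rules of §2 lend themselves to a direct simulation, which may help in visualizing the process even though the paper proceeds analytically. The following is a rough Monte Carlo sketch written with case (i) in mind; the function names, the parameter values, the time horizon and the seed are all illustrative assumptions. Streams 1 and 2 are indexed 0 and 1 in the code. At each step the vehicle admitted is the one whose earliest permissible entry time is the smaller, that time being the latest of its arrival, the last entry in its own stream plus $\beta$, and the instant at which the last opposing vehicle clears AB.

```python
import math
import random

def poisson_arrivals(rate, horizon, rng):
    """Arrival instants of a Poisson stream of the given rate on [0, horizon)."""
    times, t = [], 0.0
    while rate > 0.0:
        t += rng.expovariate(rate)
        if t >= horizon:
            break
        times.append(t)
    return times

def mean_delays(q, alpha, beta, horizon=100_000.0, seed=1):
    """Estimate (w1, w2) by simulation; q, alpha, beta are pairs indexed by stream."""
    rng = random.Random(seed)
    arrivals = [poisson_arrivals(q[i], horizon, rng) for i in (0, 1)]
    nxt = [0, 0]                           # next vehicle to enter, per stream
    last_entry = [-math.inf, -math.inf]    # last time a vehicle of each stream entered AB
    delay = [0.0, 0.0]
    while True:
        candidates = []
        for i in (0, 1):
            j = 1 - i
            if nxt[i] < len(arrivals[i]):
                earliest = max(arrivals[i][nxt[i]],        # vehicle must have arrived
                               last_entry[i] + beta[i],    # spacing within its own stream
                               last_entry[j] + alpha[j])   # opposing vehicles clear of AB
                candidates.append((earliest, i))
        if not candidates:
            break
        t, i = min(candidates)             # the vehicle able to enter first does so
        delay[i] += t - arrivals[i][nxt[i]]
        last_entry[i] = t
        nxt[i] += 1
    return tuple(delay[i] / nxt[i] if nxt[i] else float("nan") for i in (0, 1))

alone = mean_delays(q=(0.5, 0.0), alpha=(3.0, 3.0), beta=(1.0, 1.0))
both = mean_delays(q=(0.5, 0.2), alpha=(3.0, 3.0), beta=(1.0, 1.0))
print("no opposing stream:", alone)
print("with opposing stream:", both)
```

With $q_2 = 0$ the $V_1$ stream reduces to a single queue with constant service time $\beta_1$, for which the mean wait is $\beta_1 q_1 \beta_1 / \{2(1 - \beta_1 q_1)\}$ by the Pollaczek-Khinchine formula (0.5 for the values used here); this provides a check on the sketch. The second run shows how an opposing stream increases $w_1$.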

The concept of regeneration points (Kendall, 1951, p. 184) gives an indication of why
cases (ii) and (iii) appear to be less amenable to mathematical treatment. In case (i) there
are various convenient sets of regeneration points; for example, the beginnings of all
blocks of $V_1$’s. In case (ii), the only comparable set appears to be the beginnings of all
blocks of $V_2$’s, while in case (iii) there are no comparable sets. The method to be used for
case (i) makes use of two sets of regeneration points and so is inapplicable to cases (ii)
and (iii). Of course, any defined set of instants forms a set of regeneration points in any
of the three cases if the precise position of every vehicle in the system is included in the
available description of the state of the system at any time, but such a set would not be
of practical use.


3. ANALYSIS OF CASE (i) ($\alpha_1 > \beta_1$, $\alpha_2 > \beta_2$)


We carry out the analysis in the most general form, though the equations which arise have
only been solved when $\beta_1 = 0$ or $\beta_2 = 0$.

At any time, as just mentioned, AB is occupied by either a $V_1$ or a $V_2$ or is empty. Periods
when these three possibilities hold will thus alternate in a manner which must be determined. We then find the mean delays in terms of the distributions of the lengths of these
periods.

3.1. Notation and definitions


A $VB_1$ ($V_1$ block) is a period when AB is controlled by $V_1$’s, which is immediately preceded
and followed by periods when this is not the case. $VB_2$’s and $VB_0$’s are similarly defined, the
latter being periods when AB is empty.













$t_0$, $t_1$ and $t_2$ are lengths of $VB_0$'s, $VB_1$'s and $VB_2$'s. Each $t_i$ ($i = 0, 1, 2$) has a distribution
whose moment generating function (m.g.f.) $E(e^{t_i T})$ with respect to an argument $T$ is $M_i(T)$.

$r_1$ and $r_2$ are the numbers of $V_1$'s and $V_2$'s waiting at the beginning of $VB_1$'s and $VB_2$'s
respectively.

$t_1 \mid r_1$, with m.g.f. $M_1(T \mid r_1)$, is the length of a $VB_1$ conditional upon the number of $V_1$'s
being $r_1$ at its beginning. $t_2 \mid r_2$ is defined similarly.

$\phi_0$, $\phi_1$ and $\phi_2$ are the relative frequencies of $VB_0$'s, $VB_1$'s and $VB_2$'s, with $\phi_0 + \phi_1 + \phi_2 = 1$.

$q_{ij}$ ($i \neq j$) is the probability that a $VB_j$ chosen at random was preceded by a $VB_i$.

Other symbols: 


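$$M_1(-q_2) = M_1, \qquad M_2(-q_1) = M_2,$$
$$\delta_1 = q_1 + q_2 - q_1 M_1, \qquad \delta_2 = q_1 + q_2 - q_2 M_2,$$
$$\gamma = M_1 + M_2 - M_1 M_2,$$
$$\epsilon_1 = [e^{(\alpha_1 - \beta_1) q_1} - 1]/q_1, \qquad \epsilon_2 = [e^{(\alpha_2 - \beta_2) q_2} - 1]/q_2.$$

$\nu_1$ is a function of $T$, the smaller real root of
$$\nu_1 = e^{\beta_1 q_1 (\nu_1 - 1) + \beta_1 T}.$$

Similarly
$$\nu_2 = e^{\beta_2 q_2 (\nu_2 - 1) + \beta_2 T},$$
$$\rho_1 = \beta_1 q_1, \qquad \rho_2 = \beta_2 q_2.$$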


It may be shown that the condition for a stationary solution to the problem to exist is
$\rho_1 + \rho_2 < 1$, and it is assumed throughout that when this is satisfied, all the series which arise
converge and the m.g.f.'s of all the distributions exist. Proofs of these assumptions could be
given, but they are thought to be sufficiently intuitive.


3-2. Outline of method 

Stage 1: Find the $q_{ij}$ and the $\phi_i$ in terms of $M_1$ and $M_2$ (and the basic constants).

Stage 2: Find the distributions of $r_1$ and $r_2$, also in terms of $M_1$ and $M_2$, by considering the
block immediately preceding.

Stage 3: Find the distributions of $t_1$ and $t_2$, first for fixed $r_1$ and $r_2$, and then averaged over
the distributions of these just obtained.

The assumption that the process is stationary leads to a pair of equations involving
$M_1(T)$ and $M_2(T)$ which contain the information necessary to obtain these functions.

Stage 4: Find the mean delays as functions of the distributions of $t_1$ and $t_2$.


3-3. Stage 1
Define $p_{ij}$ as the probability that a random $VB_i$ is followed by a $VB_j$. Suppose we have
a $VB_1$ of length $t_1$; then the probability that it is followed by a $VB_0$ is the chance that no $V_2$
arrived during the interval $t_1$, i.e. it is $e^{-q_2 t_1}$. Thus $p_{10} = E(e^{-q_2 t_1})$, taken over the distribution
of $t_1$. Hence
$$p_{10} = M_1.$$

Similarly
$$p_{20} = M_2, \qquad p_{12} = 1 - M_1, \qquad p_{21} = 1 - M_2.$$

Also
$$p_{01} = \frac{q_1}{q_1 + q_2}, \qquad p_{02} = \frac{q_2}{q_1 + q_2}.$$













J.C. TANNER 61 


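It is easily seen that
$$\phi_0 = p_{10}\phi_1 + p_{20}\phi_2 = q_{01}\phi_1 + q_{02}\phi_2,$$
$$\phi_1 = p_{21}\phi_2 + p_{01}\phi_0 = q_{12}\phi_2 + q_{10}\phi_0,$$
$$\phi_2 = p_{02}\phi_0 + p_{12}\phi_1 = q_{20}\phi_0 + q_{21}\phi_1,$$
and these equations, together with $q_{01} + q_{21} = 1$, etc., give
$$\phi_0 = \gamma(q_1 + q_2)/[\delta_1 + \delta_2 + \gamma(q_1 + q_2)],$$
$$\phi_1 = \delta_2/[\delta_1 + \delta_2 + \gamma(q_1 + q_2)], \qquad (1)$$
$$\phi_2 = \delta_1/[\delta_1 + \delta_2 + \gamma(q_1 + q_2)],$$
$$q_{01} = q_1\gamma/\delta_2, \qquad q_{21} = (1 - M_2)\,\delta_1/\delta_2,$$
$$q_{02} = q_2\gamma/\delta_1, \qquad q_{12} = (1 - M_1)\,\delta_2/\delta_1, \qquad (2)$$
$$q_{10} = M_1\delta_2/\{\gamma(q_1 + q_2)\}, \qquad q_{20} = M_2\delta_1/\{\gamma(q_1 + q_2)\}.$$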
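The relations (1) and (2) can be checked numerically. The sketch below is only illustrative: the function name `block_frequencies` and the trial values of $q_1$, $q_2$, $M_1$ and $M_2$ are assumptions chosen for the example (in the paper $M_1$ and $M_2$ are unknowns determined later), and the code simply evaluates the formulas and confirms that the block frequencies and the backward transition probabilities are properly normalized.

```python
def block_frequencies(q1, q2, M1, M2):
    """Evaluate equations (1) and (2) for given flows and given M1, M2.

    Returns (phi, q) where phi = (phi0, phi1, phi2) and q[(i, j)] is the
    probability that a random VB_j was preceded by a VB_i.
    """
    Q = q1 + q2
    gamma = M1 + M2 - M1 * M2
    delta1 = Q - q1 * M1
    delta2 = Q - q2 * M2
    denom = delta1 + delta2 + gamma * Q
    phi = (gamma * Q / denom, delta2 / denom, delta1 / denom)
    q = {
        (0, 1): q1 * gamma / delta2,
        (2, 1): (1 - M2) * delta1 / delta2,
        (0, 2): q2 * gamma / delta1,
        (1, 2): (1 - M1) * delta2 / delta1,
        (1, 0): M1 * delta2 / (gamma * Q),
        (2, 0): M2 * delta1 / (gamma * Q),
    }
    return phi, q


# Hypothetical trial values, used only to exercise the formulas.
phi, q = block_frequencies(q1=0.3, q2=0.5, M1=0.6, M2=0.4)
```

With these trial values the three $\phi_i$ sum to one, each pair such as $q_{01} + q_{21}$ sums to one, and $\phi_0 = M_1\phi_1 + M_2\phi_2$, as the balance equations require.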


3-4. Stage 2
We consider the distribution of $r_1$. The chance that a random $VB_1$ is preceded by a $VB_0$
is $q_{01}$, and in this case $r_1 = 1$. Consider now a random $VB_2$ of length $t_2$. The chance that just
$r$ $V_1$'s arrive during this $VB_2$ is
$$e^{-q_1 t_2}(q_1 t_2)^r/r! \qquad (r = 0, 1, 2, \ldots).$$
Averaged over all $VB_2$'s this is
$$E[e^{-q_1 t_2}(q_1 t_2)^r]/r!,$$
where the expectation is taken over the distribution of $t_2$. Hence, considering only those
$VB_2$'s which are followed by $VB_1$'s, the probability of just $r$ is
$$\frac{E[e^{-q_1 t_2}(q_1 t_2)^r]}{r!\,\{1 - E[e^{-q_1 t_2}]\}} \qquad (r = 1, 2, 3, \ldots).$$
The m.g.f. of this probability distribution is
$$M(T) = \sum_{r=1}^{\infty} E[e^{-q_1 t_2}(q_1 t_2 e^T)^r/r!]/(1 - M_2)$$
$$= E[e^{-q_1 t_2}(e^{q_1 t_2 e^T} - 1)]/(1 - M_2)$$
$$= [M_2(q_1(e^T - 1)) - M_2]/(1 - M_2).$$
We thus have the m.g.f. of the distribution of $r_1$ as
$$q_{01}e^T + q_{21}[M_2(q_1(e^T - 1)) - M_2]/(1 - M_2) = q_1\gamma e^T/\delta_2 + \delta_1[M_2(q_1(e^T - 1)) - M_2]/\delta_2, \qquad (3)$$
and similarly that of $r_2$ as
$$q_2\gamma e^T/\delta_1 + \delta_2[M_1(q_2(e^T - 1)) - M_1]/\delta_1. \qquad (4)$$


3-5. Stage 3 
We require the following lemma: 


Lemma. We have a queue in which the units arrive at random, at a rate $q$ per unit time,
each unit taking a time $\beta$ to be served. Initially there are $r$ units in the queue, and the first
one starts to be served. Then the m.g.f. of $n\beta$, where $n$ is the number of units which will be
served before the next time there are none being served, is $\nu^r$ where $\nu$ is the smaller real root of
$$\nu = e^{\beta q(\nu - 1) + \beta T}. \qquad (5)$$


When r = 1, the problem is that of finding the distribution of the number of individuals 
in all generations arising from a single ancestor, to use population mathematics terminology, 
and has been studied by several writers (e.g. Harris, 1948; Good, 1948; Kendall, 1951). 










A proof will not be given here. The result for a general value of r follows immediately since 
n can be regarded as the sum of r independent components, the descendants of the r initial 
units. 

Borel (1942) has given the frequency distribution of $n$ for the case $r = 1$. It is of interest
to note, although we do not need it here, that his method applies almost unchanged for a
general value of $r$, the probability $p_n$ of $n$ individuals being
$$p_n = e^{-\beta q n}(\beta q)^{n-r}\, r\, n^{n-r-1}/(n - r)! \qquad (n = r, r + 1, \ldots). \qquad (6)$$

The m.g.f. of $n\beta$ can quite easily be found from the above expression for $p_n$.
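The lemma and the distribution (6) can be compared numerically. The following is a minimal sketch under stated assumptions: the function names `nu` and `borel_tanner_pmf` are hypothetical, the smaller root of (5) is found by simple fixed-point iteration started from zero (a choice made here, not taken from the paper), and the trial values $\beta = 0.5$, $q = 1$, $r = 2$ are arbitrary.

```python
import math


def nu(T, beta, q, iterations=500):
    """Smaller real root of nu = exp(beta*q*(nu - 1) + beta*T), equation (5).

    Fixed-point iteration from 0; assumes beta*q < 1 and T small enough
    for a real root to exist.
    """
    v = 0.0
    for _ in range(iterations):
        v = math.exp(beta * q * (v - 1.0) + beta * T)
    return v


def borel_tanner_pmf(n, r, rho):
    """Probability (6) that n units are served, starting with r; rho = beta*q."""
    if n < r:
        return 0.0
    k = n - r
    log_p = (-rho * n + k * math.log(rho) + math.log(r)
             + (k - 1) * math.log(n) - math.lgamma(k + 1))
    return math.exp(log_p)


beta, q, r, T = 0.5, 1.0, 2, -0.3
rho = beta * q
# m.g.f. of n*beta computed directly from (6), truncated where terms are negligible.
mgf_from_pmf = sum(borel_tanner_pmf(n, r, rho) * math.exp(n * beta * T)
                   for n in range(r, 600))
mgf_from_lemma = nu(T, beta, q) ** r
```

For these values the two m.g.f. evaluations agree to within rounding, the probabilities sum to one, and the mean of $n$ is $r/(1 - \rho)$.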
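We now apply this lemma to prove
$$M_1(T \mid r_1) = \frac{\nu_1^{r_1}\, e^{(\alpha_1 - \beta_1)(T - q_1)}\,(q_1 - T)}{q_1 - T - \nu_1 q_1 + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}}, \qquad (7)$$
$$M_2(T \mid r_2) = \frac{\nu_2^{r_2}\, e^{(\alpha_2 - \beta_2)(T - q_2)}\,(q_2 - T)}{q_2 - T - \nu_2 q_2 + \nu_2 q_2 e^{(\alpha_2 - \beta_2)(T - q_2)}}. \qquad (8)$$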


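We consider (7). $t_1 \mid r_1$ is the time that elapses from the initial state in which $r_1$ $V_1$'s are waiting
till there are no $V_1$'s in AB. This time can be split up into the following components:

(a) an '$r_1$ wave';*

(b) an equal number, $A$, of unit waves, and gaps whose distributions are negative-
exponential between 0 and $\alpha_1 - \beta_1$; $A$ is a random variable;

(c) a final time $\alpha_1 - \beta_1$.

These components are distributed independently of each other, so we determine the
m.g.f.'s of each and take their product to give $M_1(T \mid r_1)$.

For (a), the m.g.f. is $\nu_1^{r_1}$, by the lemma.

The gaps in (b) have the distribution
$$q_1 e^{-q_1 t}\,dt \Big/ \int_0^{\alpha_1 - \beta_1} q_1 e^{-q_1 t}\,dt \qquad (0 < t < \alpha_1 - \beta_1),$$
whence their m.g.f. is
$$\frac{q_1}{q_1 - T}\cdot\frac{e^{(\alpha_1 - \beta_1)(T - q_1)} - 1}{e^{-(\alpha_1 - \beta_1) q_1} - 1}.$$

The 'waves' in (b) have m.g.f. $\nu_1$, by the lemma. The probability $P(A)$ that we have just $A$
gaps and waves is the probability that the first $A$ gaps following waves are all less than
$\alpha_1 - \beta_1$ and that the next one is greater than $\alpha_1 - \beta_1$. Thus
$$P(A) = [1 - e^{-(\alpha_1 - \beta_1) q_1}]^A\, e^{-(\alpha_1 - \beta_1) q_1} \qquad (A = 0, 1, \ldots),$$
since
$$\int_0^{\alpha_1 - \beta_1} q_1 e^{-q_1 t}\,dt = 1 - e^{-(\alpha_1 - \beta_1) q_1}.$$

The lengths of the gaps and waves are all independent of each other and of the number of
them, and so their sum has m.g.f.
$$\sum_{A=0}^{\infty} [1 - e^{-(\alpha_1 - \beta_1) q_1}]^A\, e^{-(\alpha_1 - \beta_1) q_1} \left[\nu_1\,\frac{q_1}{q_1 - T}\cdot\frac{e^{(\alpha_1 - \beta_1)(T - q_1)} - 1}{e^{-(\alpha_1 - \beta_1) q_1} - 1}\right]^A
= \frac{e^{-(\alpha_1 - \beta_1) q_1}\,(q_1 - T)}{q_1 - T - \nu_1 q_1 + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}}.$$

The final time in (c) is constant and so contributes a factor $e^{(\alpha_1 - \beta_1) T}$. Hence we finally have
$$M_1(T \mid r_1) = \frac{\nu_1^{r_1}\, e^{(\alpha_1 - \beta_1)(T - q_1)}\,(q_1 - T)}{q_1 - T - \nu_1 q_1 + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}}.$$
The expression for $M_2(T \mid r_2)$ may be derived likewise.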
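The assembly of components (a), (b) and (c) into the closed form (7) can be checked numerically. This is a sketch only: the function names are hypothetical, the smaller root of (5) is obtained by fixed-point iteration (an assumption of this example), and the parameter values are arbitrary trial values with $\beta_1 q_1 < 1$.

```python
import math


def nu(T, beta, q, iterations=500):
    """Smaller real root of nu = exp(beta*q*(nu - 1) + beta*T), by iteration from 0."""
    v = 0.0
    for _ in range(iterations):
        v = math.exp(beta * q * (v - 1.0) + beta * T)
    return v


def m1_closed_form(T, r, q1, alpha1, beta1):
    """Right-hand side of equation (7)."""
    a = alpha1 - beta1
    v = nu(T, beta1, q1)
    e = math.exp(a * (T - q1))
    return v ** r * e * (q1 - T) / (q1 - T - v * q1 + v * q1 * e)


def m1_by_components(T, r, q1, alpha1, beta1, terms=2000):
    """Product of the m.g.f.'s of components (a), (b) and (c), summing over A."""
    a = alpha1 - beta1
    v = nu(T, beta1, q1)
    p_end = math.exp(-a * q1)                      # chance a gap exceeds alpha1 - beta1
    gap = (q1 / (q1 - T)) * (math.exp(a * (T - q1)) - 1.0) / (p_end - 1.0)
    ratio = (1.0 - p_end) * v * gap                # one (gap, unit wave) pair
    series = sum(p_end * ratio ** A for A in range(terms))
    return v ** r * series * math.exp(a * T)


closed = m1_closed_form(-0.2, 3, 1.0, 1.0, 0.3)
assembled = m1_by_components(-0.2, 3, 1.0, 1.0, 0.3)
```

The two evaluations agree to rounding error, and the closed form equals one at $T = 0$, as an m.g.f. must.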


* This is defined as the $n\beta$ of the lemma, and is $\beta$ greater than the time that elapses until there are
no $V_1$'s waiting.
















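We next average $M_1(T \mid r_1)$ over the distribution of $r_1$ given by (3). We have
$$M_1(T) = E(M_1(T \mid r_1))$$
$$= \frac{e^{(\alpha_1 - \beta_1)(T - q_1)}\,(q_1 - T)}{q_1 - T - \nu_1 q_1 + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}}\; E(\nu_1^{r_1})$$
$$= \frac{e^{(\alpha_1 - \beta_1)(T - q_1)}\,(q_1 - T)}{q_1 - T - \nu_1 q_1 + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}} \left[\frac{q_1\gamma\nu_1}{\delta_2} + \frac{\delta_1}{\delta_2}\{M_2(q_1(\nu_1 - 1)) - M_2\}\right].$$
Hence, finally,
$$\delta_2[q_1 - T - \nu_1 q_1 + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}]\, M_1(T)$$
$$= e^{(\alpha_1 - \beta_1)(T - q_1)}\,(q_1 - T)\,[q_1\gamma\nu_1 + \delta_1\{M_2(q_1(\nu_1 - 1)) - M_2\}]. \qquad (9)$$

Similarly,
$$\delta_1[q_2 - T - \nu_2 q_2 + \nu_2 q_2 e^{(\alpha_2 - \beta_2)(T - q_2)}]\, M_2(T)$$
$$= e^{(\alpha_2 - \beta_2)(T - q_2)}\,(q_2 - T)\,[q_2\gamma\nu_2 + \delta_2\{M_1(q_2(\nu_2 - 1)) - M_1\}]. \qquad (10)$$
The solution of these equations in the unknown functions $M_1(T)$ and $M_2(T)$ to give the
moments of the distributions of $t_1$ and $t_2$ has not been achieved in the general case. The
solution has, however, been obtained when one of the $\beta$'s is zero, and this is considered
later. We note, however, that in the general case, expansion of (9) and (10) leads to
$$(1 - \rho_1 - \rho_2)\,\delta_2\,\mu_1'(t_1) = \delta_2\epsilon_1(1 - \rho_2) + \delta_1\epsilon_2\rho_1 + \rho_1\gamma$$
and
$$(1 - \rho_1 - \rho_2)\,\delta_1\,\mu_1'(t_2) = \delta_2\epsilon_1\rho_2 + \delta_1\epsilon_2(1 - \rho_1) + \rho_2\gamma. \qquad (11)$$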


3-6. Stage 4 
We now derive the mean delay to a $V_1$ by considering the average number of $V_1$'s waiting
at a random instant. Suppose that the average number waiting during a $VB_1$ is $m_1$ and
during a $VB_2$ is $m_2$. Then the average number waiting at a random instant is
$$\frac{m_1\phi_1\mu_1'(t_1) + m_2\phi_2\mu_1'(t_2)}{\phi_0/(q_1 + q_2) + \phi_1\mu_1'(t_1) + \phi_2\mu_1'(t_2)}, \qquad (12)$$
since $\phi_0/(q_1 + q_2)$, ... are the relative amounts of time in $VB_0$'s, .... The mean delay, which
we denote by $\bar{w}_1$, is then the average number waiting given by (12) divided by the arrival rate:
$$\bar{w}_1 = (\text{average number waiting})/q_1. \qquad (13)$$

In a $VB_2$ of length $t_2$, the average number of $V_1$'s waiting will be $\tfrac{1}{2}q_1 t_2$, since the expected
number at a time $t$ after the beginning of the block is $q_1 t$. Hence, averaging over the
distribution of $t_2$, we obtain
$$m_2 = \frac{E(\tfrac{1}{2}q_1 t_2 \cdot t_2)}{E(t_2)} = \frac{\tfrac{1}{2}q_1\mu_2'(t_2)}{\mu_1'(t_2)}.$$
It thus remains to find $m_1$.

We find $m_1$ first of all in terms of $\psi(r)$, the expected total delay to $V_1$'s during an $r$-wave,
as previously defined. Consider a $VB_1$ which starts off with an $r$-wave. As before, we can split
the block into a number of parts, during which the expected total delay can be written down
immediately, as follows:

(i) During the $r$-wave, expected total delay = $\psi(r)$.

(ii) During the unit waves, expected total delay = $(e^{(\alpha_1 - \beta_1) q_1} - 1)\,\psi(1)$.
(iii) During the gaps, expected total delay = 0.
(iv) During the final $\alpha_1 - \beta_1$, expected total delay = 0.

















Hence, adding together for the whole block,
$$\text{expected total delay} = \psi(r) + (e^{(\alpha_1 - \beta_1) q_1} - 1)\,\psi(1). \qquad (14)$$
Averaging over the distribution of $r$, we obtain
$$\text{expected total delay} = E[\psi(r)] + (e^{(\alpha_1 - \beta_1) q_1} - 1)\,\psi(1).$$

Also, average length of block = $\mu_1'(t_1)$.
$m_1$ is now obtained as the ratio of the last two expressions.

To find $\psi(r)$, we break up the $r$-wave into 'generations', as follows (the procedure is that
followed by Borel, 1942): While the initial $r$ vehicles are departing, suppose a further $a$
arrive; and that while the $a$ are departing, a further $b$ arrive; and so on. The number of such
generations can be indefinitely large, and the chance that the sizes of successive ones are
$r, a, b, c, \ldots$ is evidently
$$P(a, b, \ldots \mid r) = \frac{e^{-\rho_1 r}(\rho_1 r)^a}{a!}\cdot\frac{e^{-\rho_1 a}(\rho_1 a)^b}{b!}\cdot\frac{e^{-\rho_1 b}(\rho_1 b)^c}{c!}\cdots$$
$$= \frac{e^{-\rho_1 r}\,\theta^{a + b + \cdots}\, r^a a^b b^c \cdots}{a!\,b!\,c!\cdots}, \quad \text{where } \theta = \rho_1 e^{-\rho_1} \qquad (15)$$
($0^0$ is to be taken as unity).
In this case, the expected total delay is
$$\tfrac{1}{2}\beta_1[r(r - 1) + a(a - 1) + b(b - 1) + \ldots] + \tfrac{1}{2}\beta_1[ra + ab + bc + \ldots]. \qquad (16)$$

The terms in the first bracket arise from delays to vehicles while vehicles in their own
generation are departing, while those in the second arise from delays while the previous
generation is departing.

The detailed algebra required to obtain the expectation of the terms in (16), averaged
over the distribution given by (15), will be omitted, but may be obtained by noting,
for example, that the probability of a particular $a$, $b$, $c$, $d$ and $e$, followed by unspecified
later values, is
$$e^{-\rho_1 r}\,\theta^{a + b + c + d}\,\rho_1^{e}\, r^a a^b b^c c^d d^e/(a!\,b!\,c!\,d!\,e!).$$
We find that, for example,
$$E(e(e - 1)) = r(\rho_1^6 + \ldots + \rho_1^9) + r^2\rho_1^{10}$$
and
$$E(de) = r(\rho_1^5 + \ldots + \rho_1^8) + r^2\rho_1^9.$$
We finally obtain the expected total delay as
$$\psi(r) = \tfrac{1}{2}\beta_1\left[\frac{r(r - 1)}{1 - \rho_1} + \frac{r\rho_1}{(1 - \rho_1)^2}\right].$$
Thus
$$\psi(1) = \frac{\beta_1\rho_1}{2(1 - \rho_1)^2}, \qquad E[\psi(r)] = \frac{\beta_1}{2(1 - \rho_1)^2}\left[(2\rho_1 - 1)\,\mu_1'(r) + (1 - \rho_1)\,\mu_2'(r)\right].$$
Alternatively, it may be shown quite simply that the joint m.g.f. of $a, b, c, \ldots$, defined as
$$E(e^{a t_1 + b t_2 + c t_3 + \ldots}),$$
is equal to
$$\exp(-\rho_1 r + r\theta \exp(t_1 + \theta \exp(t_2 + \theta \exp(\ldots)))).$$

(This is closely related to expressions given by Good, 1948.) Expansion of this function as
a power series in $t_1, t_2, \ldots$ then leads to the same value of $\psi(r)$.
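The closed form for $\psi(r)$ can be compared with a direct summation of the expectation of (16) over generations. The sketch below is illustrative only: it assumes the standard moments of a branching process with Poisson offspring of mean $\rho_1$ (generation $k$ has mean $r\rho_1^k$ and variance $r(\rho_1^k + \ldots + \rho_1^{2k-1})$, which is the pattern of the expressions for $E(e(e-1))$ and $E(de)$), and the function names are hypothetical.

```python
def psi_closed_form(r, beta, rho):
    """psi(r) = (beta/2) * [ r(r-1)/(1-rho) + r*rho/(1-rho)^2 ]."""
    return 0.5 * beta * (r * (r - 1) / (1 - rho) + r * rho / (1 - rho) ** 2)


def psi_by_generations(r, beta, rho, generations=400):
    """Expectation of (16), summed generation by generation.

    X_0 = r and X_k is the size of the k-th later generation.
    Uses E[X_{k-1} X_k] = rho * E[X_{k-1}^2].
    """
    own = float(r * (r - 1))          # running sum of E[X_k (X_k - 1)]
    previous = 0.0                    # running sum of E[X_{k-1} X_k]
    second_prev = float(r * r)        # E[X_0^2]
    for k in range(1, generations + 1):
        mean = r * rho ** k
        variance = r * sum(rho ** j for j in range(k, 2 * k))
        second = variance + mean ** 2
        own += second - mean
        previous += rho * second_prev
        second_prev = second
    return 0.5 * beta * (own + previous)


value_closed = psi_closed_form(3, 0.5, 0.6)
value_summed = psi_by_generations(3, 0.5, 0.6)
```

For $r = 1$ the closed form divided by the mean number served, $1/(1 - \rho_1)$, gives $\beta_1\rho_1/\{2(1 - \rho_1)\}$, the familiar mean wait in a queue with random arrivals and constant service time.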










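Hence, averaged over the distribution of $r_1$, the expected total delay to $V_1$'s during a $VB_1$ is
$$\frac{\beta_1}{2(1 - \rho_1)^2}\left[(2\rho_1 - 1)\,\mu_1'(r_1) + (1 - \rho_1)\,\mu_2'(r_1)\right] + [e^{(\alpha_1 - \beta_1) q_1} - 1]\,\frac{\beta_1\rho_1}{2(1 - \rho_1)^2}. \qquad (17)$$
Expansion of the generating function (3) gives
$$\mu_1'(r_1) = q_1\gamma/\delta_2 + \delta_1 q_1\mu_1'(t_2)/\delta_2$$
and
$$\mu_2'(r_1) = q_1\gamma/\delta_2 + \delta_1[q_1\mu_1'(t_2) + q_1^2\mu_2'(t_2)]/\delta_2.$$

Substituting these expressions in (17) and the expression in (11) for $\mu_1'(t_2)$ then gives the
expected total delay as
$$\frac{\rho_1}{2(1 - \rho_1)(1 - \rho_1 - \rho_2)\,\delta_2}\left[\rho_1(\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1) + q_1\delta_1(1 - \rho_1 - \rho_2)\,\mu_2'(t_2)\right].$$
Collecting our results together, and substituting in (12) and (13), we find that the mean
delay to $V_1$'s is
$$\bar{w}_1 = \frac{\rho_1^2}{2q_1(1 - \rho_1)} + \frac{\delta_1\mu_2'(t_2)\,(1 - \rho_1 - \rho_2)}{2(1 - \rho_1)\,[\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1]}. \qquad (18)$$

Similarly
$$\bar{w}_2 = \frac{\rho_2^2}{2q_2(1 - \rho_2)} + \frac{\delta_2\mu_2'(t_1)\,(1 - \rho_1 - \rho_2)}{2(1 - \rho_2)\,[\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1]}. \qquad (19)$$
This completes stage 4.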


4. EXPLICIT SOLUTIONS FOR SOME SPECIAL CASES 


As already mentioned, the analysis of §3 leads to equations for the mean delays whose
solutions have not been obtained. If, however, we assume one of the $\beta$'s to be zero, an
explicit solution can be found, and we now consider this case.


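4-1. Special case $\beta_2 = 0$
In this case, equation (10) reduces to
$$M_2(T) = \frac{q_2 - T}{q_2 - T e^{\alpha_2(q_2 - T)}}, \qquad (20)$$
so that
$$M_2 = M_2(-q_1) = \frac{q_1 + q_2}{q_2 + q_1 e^{\alpha_2(q_1 + q_2)}}. \qquad (21)$$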
Substituting in (9), and putting $T = -q_2$, leads on simplification to
+4 
M, = , (22) 
pire Ya— 93(w — 1) e%sts taxa) 
qy + 1 -Ad Gita») itwoiews 
where $\omega$ is the smaller real root of
$$\omega = e^{-\beta_1(q_1 + q_2 - q_1\omega)}.$$
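Expansion of equations (9) and (20) leads to
$$\delta_2 q_1 q_2 (1 - \rho_1)^3\,\mu_2'(t_1) = 2\delta_1\rho_1^2(1 - \rho_1)\,q_1(1 + \epsilon_2 q_2)(\epsilon_2 - \alpha_2)$$
$$+\; 2\delta_2 q_2(1 - \rho_1)^2(1 + \epsilon_1 q_1)(\epsilon_1 - \alpha_1 + \beta_1) \qquad (23)$$
$$+\; q_2\rho_1(\gamma + \delta_1\epsilon_2 + \delta_2\epsilon_1)(\rho_1 + 2q_1\epsilon_1 - 2q_1\epsilon_1\rho_1) = P,$$
$$q_2\,\mu_2'(t_2) = 2(1 + \epsilon_2 q_2)(\epsilon_2 - \alpha_2).$$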
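The moments of $t_2$ implied by (20) can be checked by differentiating the m.g.f. numerically. This is a sketch under stated assumptions: the function names and the trial values $q_2 = 0.5$, $\alpha_2 = 1$ are chosen for the example, and central finite differences stand in for the series expansion used in the text.

```python
import math


def M2(T, q2, alpha2):
    """m.g.f. (20) of the length of a VB_2 when beta_2 = 0."""
    return (q2 - T) / (q2 - T * math.exp(alpha2 * (q2 - T)))


def moments_numeric(q2, alpha2, h=1e-3):
    """First and second moments about the origin by central differences at T = 0."""
    first = (M2(h, q2, alpha2) - M2(-h, q2, alpha2)) / (2 * h)
    second = (M2(h, q2, alpha2) - 2 * M2(0.0, q2, alpha2) + M2(-h, q2, alpha2)) / h ** 2
    return first, second


def moments_closed(q2, alpha2):
    """mu_1'(t_2) = eps_2 and q2 * mu_2'(t_2) = 2 (1 + eps_2 q2)(eps_2 - alpha_2)."""
    eps2 = (math.exp(alpha2 * q2) - 1.0) / q2
    return eps2, 2.0 * (1.0 + eps2 * q2) * (eps2 - alpha2) / q2


numeric = moments_numeric(0.5, 1.0)
closed = moments_closed(0.5, 1.0)
```

The first moment reproduces $\mu_1'(t_2) = \epsilon_2$, which is what (11) gives when $\rho_2 = 0$, and the second moment reproduces the last line of the display above.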








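Hence, by (18) and (19),
$$\bar{w}_1 = \frac{\rho_1^2}{2q_1(1 - \rho_1)} + \frac{\delta_1(\epsilon_2 q_2 + 1)(\epsilon_2 - \alpha_2)}{q_2(\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1)}$$
$$= \frac{\rho_1^2}{2q_1(1 - \rho_1)} + \frac{\delta_1 e^{q_2\alpha_2}(e^{q_2\alpha_2} - \alpha_2 q_2 - 1)}{q_2^2(\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1)} \qquad (24)$$
and
$$\bar{w}_2 = \frac{1}{2(1 - \rho_1)^2 q_1 q_2}\cdot\frac{P}{\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1}, \quad \text{where } P \text{ is given by (23)}. \qquad (25)$$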


These expressions for $\bar{w}_1$ and $\bar{w}_2$ are explicit formulae, since the only quantities they
contain, apart from the basic constants, are $\gamma$, $\delta_1$ and $\delta_2$, which can be expressed in terms of
these by means of the relations (21) and (22). However, they are obviously unsuitable for
tabulation and it is not thought that any important simplification can be made to them.
This special case will therefore be left in this rather unsatisfactory state.


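4-2. Special case $\beta_1 = \beta_2 = 0$
From (24) and (25) we have
$$\bar{w}_1 = \frac{\delta_1 e^{q_2\alpha_2}(e^{q_2\alpha_2} - \alpha_2 q_2 - 1)}{q_2^2(\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1)},$$
$$\bar{w}_2 = \frac{\delta_2 e^{q_1\alpha_1}(e^{q_1\alpha_1} - \alpha_1 q_1 - 1)}{q_1^2(\gamma + \epsilon_1\delta_2 + \epsilon_2\delta_1)}.$$
Using (21) and (22) we find
$$\gamma = \frac{(q_1 + q_2)\,(q_2 e^{\alpha_1(q_1 + q_2)} + q_1 e^{\alpha_2(q_1 + q_2)})}{(q_2 + q_1 e^{\alpha_2(q_1 + q_2)})\,(q_1 + q_2 e^{\alpha_1(q_1 + q_2)})},$$
$$\delta_1 = \frac{(q_1 + q_2)\,q_2 e^{\alpha_1(q_1 + q_2)}}{q_1 + q_2 e^{\alpha_1(q_1 + q_2)}}, \qquad \delta_2 = \frac{(q_1 + q_2)\,q_1 e^{\alpha_2(q_1 + q_2)}}{q_2 + q_1 e^{\alpha_2(q_1 + q_2)}},$$
whence
$$\bar{w}_1 = \frac{e^{\alpha_1 q_2}(q_2 + q_1 e^{\alpha_2(q_1 + q_2)})}{q_1 e^{\alpha_2 q_1}(e^{(\alpha_1 + \alpha_2) q_2} - e^{\alpha_1 q_2} + 1) + q_2 e^{\alpha_1 q_2}(e^{(\alpha_1 + \alpha_2) q_1} - e^{\alpha_2 q_1} + 1)}\cdot\frac{e^{\alpha_2 q_2} - \alpha_2 q_2 - 1}{q_2} \qquad (26)$$
and
$$\bar{w}_2 = \frac{e^{\alpha_2 q_1}(q_1 + q_2 e^{\alpha_1(q_1 + q_2)})}{q_2 e^{\alpha_1 q_2}(e^{(\alpha_1 + \alpha_2) q_1} - e^{\alpha_2 q_1} + 1) + q_1 e^{\alpha_2 q_1}(e^{(\alpha_1 + \alpha_2) q_2} - e^{\alpha_1 q_2} + 1)}\cdot\frac{e^{\alpha_1 q_1} - \alpha_1 q_1 - 1}{q_1}. \qquad (27)$$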


Table 1 below tabulates $\bar{w}_2$ for the case $\alpha_1 = \alpha_2 = 1$, for $q_1 = 0.0\,(0.5)\,2.0$, $q_2 = 0.0\,(0.5)\,2.0$.


Table 1. Values of $\bar{w}_2$ for $\beta_1 = \beta_2 = 0$, $\alpha_1 = \alpha_2 = 1$

                    q_2
   q_1     0.0    0.5    1.0    1.5    2.0
   0.0    0.00   0.00   0.00   0.00   0.00
   0.5    0.30   0.27   0.24   0.21   0.17
   1.0    0.72   0.61   0.53   0.45   0.37
   1.5    1.32   1.11   0.97   0.84   0.70
   2.0    2.20   1.87   1.67   1.47   1.27
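Formulas (26) and (27) are straightforward to evaluate. The sketch below is illustrative: the function names are hypothetical, (27) is obtained from (26) by interchanging the roles of the two streams, and the printed table is read as giving the delay for rows indexed by the flow whose gap requirement produces the delay (an assumption about the row and column labels; with $\alpha_1 = \alpha_2$ the table is the same under either labelling).

```python
import math


def w1_bar(q1, q2, alpha1, alpha2):
    """Mean delay to V_1's from equation (26), beta_1 = beta_2 = 0; requires q2 > 0."""
    Q = q1 + q2
    numerator = math.exp(alpha1 * q2) * (q2 + q1 * math.exp(alpha2 * Q))
    denominator = (
        q1 * math.exp(alpha2 * q1)
        * (math.exp((alpha1 + alpha2) * q2) - math.exp(alpha1 * q2) + 1.0)
        + q2 * math.exp(alpha1 * q2)
        * (math.exp((alpha1 + alpha2) * q1) - math.exp(alpha2 * q1) + 1.0)
    )
    tail = (math.exp(alpha2 * q2) - alpha2 * q2 - 1.0) / q2
    return numerator / denominator * tail


def w2_bar(q1, q2, alpha1, alpha2):
    """Mean delay to V_2's from equation (27), by symmetry with (26); requires q1 > 0."""
    return w1_bar(q2, q1, alpha2, alpha1)


# A few entries with alpha_1 = alpha_2 = 1.
sample = {
    (0.5, 0.5): w2_bar(0.5, 0.5, 1.0, 1.0),
    (1.0, 0.5): w2_bar(1.0, 0.5, 1.0, 1.0),
    (2.0, 2.0): w2_bar(2.0, 2.0, 1.0, 1.0),
}
```

These three values agree with the tabulated entries 0.27, 0.61 and 1.27 to within rounding in the second decimal place.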



















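4-3. Special case $\alpha_2 = \beta_2 = 0$
In this case we have $M_2 = 1$, $\delta_2 = q_1$, $\epsilon_2 = 0$, $\gamma = 1$.

Hence by (24) and (25),
$$\bar{w}_1 = \frac{\rho_1^2}{2q_1(1 - \rho_1)}, \qquad (28)$$
$$\bar{w}_2 = \frac{1}{2(1 - \rho_1)^2 q_1}\left[2\rho_1 - 3\rho_1^2 + 2\rho_1^3 + 2q_1\epsilon_1(1 - \rho_1) - 2q_1\alpha_1(1 - \rho_1)^2\right],$$
i.e.
$$\bar{w}_2 = \frac{e^{q_1(\alpha_1 - \beta_1)}}{q_1(1 - \rho_1)} - \alpha_1 - \frac{1}{q_1} + \frac{-\rho_1^2 + 2\rho_1^3}{2q_1(1 - \rho_1)^2}. \qquad (29)$$
This case corresponds to a physical situation in which vehicles in stream 2 do not affect
those in stream 1 in any way. $\bar{w}_1$ is therefore the mean delay for a simple queue with arrivals
$q_1$ per unit time and service time $\beta_1$.

Table 2 below tabulates $\bar{w}_2$ for $\alpha_1 = 1$, $\beta_1 = 0.0\,(0.1)\,1.0$, $q_1 = 0.0\,(0.2)\,2.0$, for values
of $\bar{w}_2$ less than 10.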


Table 2. Values of $\bar{w}_2$ for the special case $\alpha_2 = \beta_2 = 0$, $\alpha_1 = 1$

                                        beta_1
   q_1     0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
   0.0    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
   0.2    0.11   0.11   0.11   0.11   0.11   0.12   0.12   0.13   0.13   0.14   0.16
   0.4    0.23   0.23   0.23   0.24   0.25   0.27   0.30   0.34   0.38   0.46   0.56
   0.6    0.37   0.37   0.38   0.40   0.43   0.49   0.57   0.70   0.92   1.27   1.88
   0.8    0.53   0.54   0.56   0.60   0.67   0.83   1.04   1.37   2.37   4.40  10.00
   1.0    0.72   0.73   0.76   0.84   0.99   1.30   1.96   3.59   8.91     -      *
   1.2    0.93   0.95   1.01   1.15   1.44   2.15   4.19     -      -      *      *
   1.4    1.18   1.21   1.30   1.54   2.12   3.86     -      -      *      *      *
   1.6    1.47   1.51   1.66   2.05   3.19   8.33     -      *      *      *      *
   1.8    1.80   1.86   2.08   2.73   5.10     -      *      *      *      *      *
   2.0    2.20   2.27   2.60   3.56   9.20     *      *      *      *      *      *

* $\rho_1 \geq 1$; no stationary solution exists.
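Formula (29) can be evaluated directly to reproduce entries of Table 2. The following is a minimal sketch; the function name is hypothetical and the check values are simply table entries with $\alpha_1 = 1$.

```python
import math


def w2_bar_one_way(q1, alpha1, beta1):
    """Mean delay (29) to V_2's when alpha_2 = beta_2 = 0; requires q1 > 0."""
    rho1 = beta1 * q1
    if rho1 >= 1.0:
        raise ValueError("rho_1 >= 1: no stationary solution exists")
    return (
        math.exp(q1 * (alpha1 - beta1)) / (q1 * (1.0 - rho1))
        - alpha1
        - 1.0 / q1
        + (-rho1 ** 2 + 2.0 * rho1 ** 3) / (2.0 * q1 * (1.0 - rho1) ** 2)
    )


entries = {
    (1.0, 0.0): w2_bar_one_way(1.0, 1.0, 0.0),
    (1.0, 0.5): w2_bar_one_way(1.0, 1.0, 0.5),
    (1.0, 0.8): w2_bar_one_way(1.0, 1.0, 0.8),
    (0.8, 1.0): w2_bar_one_way(0.8, 1.0, 1.0),
}
```

With $\beta_1 = 0$ the expression reduces to $(e^{\alpha_1 q_1} - \alpha_1 q_1 - 1)/q_1$, and the four sample values match the tabulated 0.72, 1.30, 8.91 and 10.00 to the printed accuracy.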


In this special case, it is possible to obtain an expression for the m.g.f. $M_{w_2}(T)$ of the
distribution of the delay $w_2$. We proceed as follows:

If a $V_2$ arrives in a $VB_1$ of length $t_1$, its delay will have a rectangular distribution in the
interval $(0, t_1)$. The m.g.f. of this delay will therefore be
$$(e^{t_1 T} - 1)/t_1 T.$$
Averaging over the distribution of $t_1$ gives
$$E[t_1(e^{t_1 T} - 1)/t_1 T]/E[t_1] = (M_1(T) - 1)/T\mu_1'(t_1). \qquad (30)$$

Now, by (7), and since $r_1$ is always unity, we have
$$M_1(T) = \frac{\nu_1 e^{(\alpha_1 - \beta_1)(T - q_1)}\,(q_1 - T)}{q_1 - T - \nu_1 q_1 + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}},$$
so that
$$\mu_1'(t_1) = \frac{e^{(\alpha_1 - \beta_1) q_1}}{q_1(1 - \rho_1)} - \frac{1}{q_1}. \qquad (31)$$














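If a $V_2$ arrives in a $VB_0$, it will have zero delay. The probability of this is
$$\frac{1/q_1}{\mu_1'(t_1) + 1/q_1} = \frac{1 - \rho_1}{e^{(\alpha_1 - \beta_1) q_1}}, \qquad (32)$$
since $VB_0$'s and $VB_1$'s are equally frequent.
From (30), (31) and (32) we obtain
$$M_{w_2}(T) = \frac{(1 - \rho_1)\,(q_1 - T)\,(T - q_1 + \nu_1 q_1)}{T e^{(\alpha_1 - \beta_1) q_1}\,[-\nu_1 q_1 + q_1 - T + \nu_1 q_1 e^{(\alpha_1 - \beta_1)(T - q_1)}]}. \qquad (33)$$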


4-4. The special case $\beta_1 = \alpha_2 = \beta_2 = 0$
The solution for this case can be derived quite easily from the last section. We obtain
$$M_{w_2}(T) = \frac{(q_1 - T)\,e^{-\alpha_1(q_1 + T)}}{q_1 e^{-\alpha_1 q_1} - T e^{-\alpha_1 T}},$$
and
$$\bar{w}_1 = 0,$$
$$\bar{w}_2 = (e^{\alpha_1 q_1} - \alpha_1 q_1 - 1)/q_1. \qquad (34)$$

This case has been considered in more detail by Garwood (1940), Raff (1951) and Tanner
(1951).


5. APPLICATIONS 


In this section we consider briefly some particular applications of the preceding analysis. 


5-1. Delays to pedestrians crossing a road 


Suppose a stream of vehicles, flow $q_1$ per unit time, is passing a bottleneck where they are
constrained to pass in a single line, with minimum separation $\beta_1$. At this bottleneck,
pedestrians, who demand a gap $\alpha_1$ between themselves and the next vehicle to arrive, are crossing
the road. Then their average delay is given by (29) and Table 2. This can be seen by noting
that both for the situation just stated and for the solution of the general problem with
$\alpha_1 > \beta_1 > 0$, $\alpha_2 = \beta_2 = 0$, vehicles in stream 1 are unaffected by those in stream 2, while
vehicles (or pedestrians) in stream 2 wait for a gap $\alpha_1$ in stream 1 and do not hold it up.
(It is assumed that the time taken by a pedestrian to cross the road is less than $\alpha_1$, so that
vehicles are not held up by pedestrians.) When there is no bottleneck, i.e. $\beta_1 = 0$, the delay
is given by (34).


5-2. Delays to vehicles on minor road at uncontrolled intersections


If the suffix 2 refers to vehicles using the minor road, and we suppose that these vehicles
have to cross or merge with just one main road stream, vehicles in the latter having absolute
right of way, we have to consider the situation $\alpha_1 > \beta_1 > 0$, $\alpha_2 = 0$, $\beta_2 > 0$. (The same type of
reasoning holds as in the previous paragraph.) This is a specialization of case (iii), as defined
earlier, and the solution has not been obtained. If, however, the minor road traffic flow is
small, it will not affect the result much if we put $\beta_2 = 0$, in which case the mean delay is
given by (29), as for the pedestrian case above.


5-3. Delays to vehicles at uncontrolled intersections where 
neither stream has absolute right of way 
If we consider just two streams of traffic intersecting at right angles, the mathematical
problem is that of case (i), $\alpha_1 > \beta_1 > 0$, $\alpha_2 > \beta_2 > 0$. The solution for this has not been found,
but, as in the previous section, we can give the mean delays if $\beta_1$ and $\beta_2$ (or just one of them)










are negligible, or if one or both of the flows is very light. The delays are then given by (24) 
and (25), or by (26) and (27). 


5-4. Delays to intersecting streams of pedestrians 

There are a number of similar problems under this heading. 

If we have a staircase, passage, doorway, gangway, bridge, etc., on which one-way traffic
is enforced, we have a problem mathematically identical with the vehicle delay problem in
terms of which this paper has been formulated. The simplification $\beta_1 = \beta_2 = 0$ may more
often be appropriate here than in vehicle problems.

There will in practice often be complications in these situations, such as multi-channel 
operation, non-random arrivals, and so forth. An interesting, though no doubt very difficult 
problem, which has its counterpart in the vehicle delay problems, is to find what happens 
if instead of having single-channel operation we have several channels which could be 
operated some in one direction, some in the other, at the same time. Passing from one 
channel to another in the middle of the passage through it might or might not be permitted. 


This paper was prepared as part of the programme of the Road Research Board of the 
Department of Scientific and Industrial Research, and is published by permission of the 
Director of Road Research. 


The author wishes to thank Dr F. Garwood and Mr D. G. Kendall for advice in its 
preparation. 


REFERENCES 


Borel, E. (1942). C.R. Acad. Sci., Paris, 214, 452-6.
Garwood, F. (1940). Suppl. J. R. Statist. Soc. 7, 65-77.
Good, I. J. (1948). Proc. Camb. Phil. Soc. 45, 360-3.
Harris, T. E. (1948). Ann. Math. Statist. 19, 474-94.
Kendall, D. G. (1951). J. R. Statist. Soc. B, 13, 151-85.
Raff, M. S. (1951). J. Amer. Statist. Assoc. 46, 253.
Tanner, J. C. (1951). Biometrika, 38, 383-92.
















TABLES OF THE ANGULAR TRANSFORMATION 


By W. L. STEVENS 


Faculdade de Ciências Económicas e Administrativas,
Universidade de São Paulo, Brazil


INTRODUCTION 
In 1922, Fisher suggested the use of the transformation
$$\theta = \text{arc cos}\,(1 - 2p)$$
to facilitate the analysis of data expressed as proportions or percentages. It can be shown
that, if
$$p = x/n,$$
where $x$ is drawn from the binomial distribution
$$\{\pi + (1 - \pi)\}^n,$$
then the quantity of information about $\pi$, contained in $\theta$, is independent of $\pi$. The weight
to be attached to $\theta$, in an analysis of variance, is therefore independent of $\pi$ (or practically
independent, if we define the weight as the reciprocal of the variance). The use of $\theta$ therefore
saves us the trouble of making a preliminary adjustment to determine the weights, as we
have to do, for example, in a probit analysis.

A slightly different form of the transformation is found by writing
$$\phi = \tfrac{1}{2}\theta = \text{arc sin}\,\sqrt{p}.$$

A table of this function, with $\phi$ measured in degrees to three significant figures, is provided
in Fisher & Yates (1938, Table XII). A four-figure table, also in degrees, was published by 
Bliss (1937) and reproduced by Snedecor (1946, p. 449). A small table in radian measure 
has been given by Bartlett (1937). The use of radians has the advantage of simplifying the 
expression for the weights. A full discussion of the transformation and a bibliography will 
be found in Eisenhart, Hastay & Wallis (1947). 

Although the existing tables are usually sufficient for most practical purposes, there are 
occasions when one wants a table similar in accuracy to that of the table of probits given by 
Fisher & Yates (1938). In calculating this table, we defined the transformation as 


$$\theta = 50 - \lambda\,\text{arc sin}\,(1 - 2p),$$
where
$$\lambda = \sqrt{1000} = 31.62278\ldots.$$

This form confers three advantages:
(i) $\theta$ ranges from 0.327 to 99.673 and therefore has almost the maximum possible
accuracy for any given number of significant figures.
(ii) The weight is given by the extremely simple expression, $n/1000$, where $n$ is the
number of observations.
(iii) Like the probit function, complementary values of the function correspond to
complementary values of the argument, $p = 50\,\%$ giving $\theta = 50$.
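The transformation as defined here is easy to evaluate directly, which gives a quick check on the three stated properties. The sketch below is illustrative; the function name `angular_transform` and the constant name are assumptions, and $p$ is taken as a proportion between 0 and 1.

```python
import math

SCALE = math.sqrt(1000.0)  # the constant 31.62278... in the definition


def angular_transform(p):
    """theta = 50 - sqrt(1000) * arcsin(1 - 2p), with p a proportion in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return 50.0 - SCALE * math.asin(1.0 - 2.0 * p)


lowest = angular_transform(0.0)     # lower end of the range
middle = angular_transform(0.5)     # p = 50 % 
quarter = angular_transform(0.25)   # tabulated as 33.442 for 100p = 25
```

The end of the range comes out at 0.327, $p = 50\,\%$ gives exactly 50, and values at $p$ and $1 - p$ add to 100, in line with (i) and (iii).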



















Table 1. The angular transformation, $\theta = 50 - \sqrt{1000}\,\text{arc sin}\,(1 - 2p)$

100p | 0-0 0-1 0-2 0-3 0-4 0-5 0-6 0-7 0:8 0-9 1°83 “@ 25-5 
0 | 0-327 2-327 3156 3-793 4330 4-803 5-281 5-625 5-991 6336] For interpolation, 
1 6-662 6-973 7-269 7-554 7-828 8-093 8349 8597 8838 9-073 use Table IT 
2 | 9-301 9-525 9-743 9-956 10165 — ana ak aes — | 22 43 65 86 108 

10-369 10-570 10-767 10-960 11-150] 20 39 59 78 98 
3 | 11-337 11-521 11-702 11-880 12056 — a = prea — |18 36 54 72 90 

12-229 12-400 12-569 12-735 12-900] 17 34 50 67 84 
4 | 13-062 13-222 13-381 13-538 13-693 — ics ee oa — |16 32 47 63 79 

13-846 13-998 14-148 14-297 14-444] 15 30 45 60 75 
5 | 14-590 14-734 14-877 15-019 15-160 15-299 15-437 15-574 15-710 15845 | 14 28 42 56 70 
6 | 15-978 16-111 16-243 16-373 16-503 16-632 16-759 16-886 17-012 17-138] 13 26 39 52 64 
7 | 17-262 17-385 17-508 17-630 17-751 17-872 17-991 18-110 18-229 18346 | 12 24 36 48 60 
8 | 18-463 18-579 18-695 18-810 18-924 19-038 19-151 19-263 19-375 19-487] 11 23 34 46 57 
9 | 19-598 19-708 19-817 19-927 20-035 20-143 20-251 20-358 20-465 20-571 11 22 32 43 54 
10 | 20-676 20-782 20-886 20-991 21-094 21-198 21-301 21-403 21-505 21-607| 10 21 31 41 52 
11 | 21-708 21-809 21-910 22-010 22-109 22-209 22-308 22-406 22-504 22-602| 10 20 30 40 50 
12 | 22-700 22-797 22-894 22-990 23-086 23-182 23-277 23-373 23-467 23-562 | 10 19 29 38 48 
13 | 23-656 23-750 23-843 23-937 24-030 24-122 24-215 24-307 24-399 24-490| 9 19 28 37 46 
14 | 24-582 24-673 24-763 24-854 24-944 25-034 25-124 25-213 25-302 25:391| 9 18 27 36 45 
15 | 25-480 25-568 25-656 25-744 25-832 25-920 26-007 26-094 26-181 26-267| 9 17 26 35 44 
16 | 26-354 26-440 26-526 26-611 26-697 26-782 26-867 26-952 27-037 27-121| 9 17 26 34 43 
17 | 27-206 27-290 27-374 27-457 27-541 27-624 27-707 27-790 27-873 27:956| 8 17 25 33 42 
18 | 28-038 28-120 28-202 28-284 28-366 28-447 28-529 28-610 28-691 28-772| 8 16 24 33 41 
19 | 28-852 28-933 29-013 29-094 29-174 29-254 29-333 29-413 29-492 29:572| 8 16 24 32 40 
20 | 29-651 29-730 29-809 29-887 29-966 30-044 30-122 30-201 30-279 30:356| 8 16 23 31 39 
21 | 30-434 30-512 30-589 30-666 30-744 30-821 30-898 30-974 31-051 31-127] 8 15 23 31 38 
22 | 31-204 31-280 31-356 31-432 31-508 31-584 31-660 31-735 31-811 31:886| 8 15 23 30 38 
23 | 31-961 32-036 32-111 32-186 32-261 32-335 32-410 32-484 32-559 32-633| 7 15 22 30 37 
24 | 32-707 32-781 32-855 32-929 33-002 33-076 33-149 33-223 33-296 33-369| 7 15 22 29 37 
25 | 33-442 33-515 33-588 33-661 33-734 33-806 33-879 33-951 34-024 34:096| 7 15 22 29 36 
26 | 34-168 34:240 34-312 34-384 34-456 34-527 34-599 34-670 34-742 34:813| 7 14 22 29 36 
27 | 34-884 34-956 35-027 35-098 35-169 35-240 35-310 35-381 35-452 35°522| 7 14 21 28 35 
28 | 35-593 35-663 35-733 35-804 35-874 35-944 36-014 36-084 36-154 36-224| 7 14 21 28 35 
29 | 36-293 36-363 36-432 36-502 36-571 36-641 36-710 36-779 36849 36-918| 7 14 21 28 35 
30 | 36-987 37-056 37-125 37-193 37-262 37-331 37-400 37-468 37-537 37:605| 7 14 21 27 34 
31 | 37-674 37-742 37-810 37-878 37-947 38-015 38-083 38-151 38-219 38-287| 7 14 20 27 34 
32 | 38-354 38-422 38-490 38-557 38-625 38-693 38-760 38-828 38-895 38-962| 7 14 20 27 34 
33 | 39-030 39-097 39-164 39-231 39-298 39-365 39-432 39-499 39-566 39-633 | 7 13 20 27 34 
34 | 39-700 39-766 39-833 39-900 39-966 40-033 40-099 40-166 40-232 40-298| 7 13 20 27 33 
35 | 40-365 40-431 40-497 40-563 40-630 40-696 40-762 40-828 40-894 40-:960| 7 13 20 26 33 
36 | 41-026 41-091 41-157 41-223 41-289 41-355 41-420 41-486 41:55 41-617] 7 13 20 26 33 
37 | 41-683 41-748 41-813 41-879 41-944 42-010 42-075 42-140 42-20% 42-271 | 7 13 20 26 33 
38 | 42-336 42-401 42-466 42-531 42-596 42-661 42-726 42-791 42-856 42-921| 6 13 20 26 32 
39 | 42-986 43-050 43-115 43-180 43-245 43-309 43-374 43-439 43-5023 43-568| 6 13 19 26 32 

i 

40 | 43-633 43-697 43-762 43-826 43-890 43-955 44-019 44084 44-146 44212] 6 13 19 26 32 
41 | 44-277 44-341 44-405 44-469 44-534 44-598 44-662 44-726 44-790 44:854| 6 13 19 26 32 
42 | 44-919 44-983 45-047 45-111 45-175 45-239 45-303 45-367 45-430 45-494] 6 13 19 26 32 
43 | 45-558 45-622 45-686 45-750 45-814 45-877 45-941 46-005 46-069 46132] 6 13 19 26 32 
44 | 46-196 46-260 46-323 46-387 46-451 46-514 46-578 46-642 46-705 46-769| 6 13 19 25 32 
45 | 46-832 46-896 46-960 47-023 47-087 47-150 47-214 47-277 47-341 47-404| 6 13 19 25 32 
46 | 47-467 47-531 47-594 47-658 47-721 47-785 47-848 47-911 47-975 48-038 | 6 13 19 25 32 
47 | 48-101 48-165 48-228 48-292 48-355 48-418 48-482 48-545 48-608 48-671 | 6 13 19 25 32 
48 | 48-735 48-798 48-861 48-925 48-988 49-051 49-114 49-178 49-241 49-304| 6 13 '® 25 32 
49 | 49-368 49:431 49-494 49-557 49-621 49-684 49-747 49-810 49-874 49-937| 6 13 19 25 32 














If $100p > 50$, enter the table with $100(1 - p)$ and, for $\theta$, subtract the tabular value from 100.










Table 2. Angular transformation of percentages less than two 











100p|} 0:00 0-01 002 003 004 005 006 007 O08 009 |1 2 3 4 «5 
0-0 | 0-327 0-960 1-222 1-423 1-592 1-741 1-876 2-001 2-116 2-225 ne worey 
0-1 | 2327 2-425 2-518 2-608 2-694 2-777 2-858 2-935 3-011 3-085 } , id 
0-2 | 3-156 3226 3295 3-361 3-427 3491 3-553 3615 3675 3-735 | 6 13 19 26 32 
0-3 | 3-798 3-850 3907 3962 4017 4071 4-124 4177 4228 4279 | 5 11 16 22 27 
0-4 | 4380 4-380 4429 4477 4525 4573 4620 4666 4712 4758 | 5 10 14 19 24 
05 | 4803 4848 4892 4935 4979 5022 5064 5107 5148 5199/4 9 413 417+ 22 
0-6 | 5-231 5272 5312 5352 5392 5432 5471 5510 5548 5587/4 8 12 16 20 
0-7 | 5-625 5-663 5-700 5737 65-774 5811 5848 5884 5920 5956/4 7 11 15 18 
08 | 5-991 6027 6-062 6-097 6132 6166 6201 6235 6269 6303 | 3 7 #10 14 #17 
0-9 | 6336 6369 6403 6486 6469 6501 6534 6566 6598 6630] 3 7 10 13 16 
1:0 | 6662 6-694 6-725 6757 6-788 6819 6850 6881 6912 6942/3 6 9 12 16 
1-1 | 6-973 7-003 7-033 7-063 7-098 7-122 7-152 7-182 7211 7240/3 6 9 12 15 
1:2 | 7-269 7-298 7-327 7-356 7-384 7-413 7-441 7-470 7-498 7526} 3 6 9 11 14 
1:3 | 7-554 7-582 7-609 17-637 7-665 7-692 17-720 7-747 7-774 7801|3 5 8 ll 414 
1-4 | 7-828 7-855 7-882 7-908 7-935 7-961 7-988 8014 8040 8066|3 5 8 11 13 
15 | 8-093 8118 8144 8170 8196 8222 8247 8273 8298 8323/3 5 8 10 13 
16 | 8349 8374 8399 8424 8449 8474 8498 8523 86548 8572|2 65 7 #10 «12 
1:7 | 8-597 8621 8-646 8670 8694 8-718 8742 8766 8790 8814|2 5 7 10 12 
1:8 | 8838 8862 8885 8909 8-933 8956 8980 9-003 9026 9050/2 5 7 9 1 
1:9 | 9-073 9-096 9-119 9-142 9165 9-188 9-211 9-233 9256 927912 65 7 #@ Ui 











Interpolate linearly between tabular values when $0.05 < 100p < 0.2$. When $100p < 0.05$, use the formula,
$\theta = 0.327 + 6.325\sqrt{100p}$. For values of $100p$ greater than 98, enter with $100(1 - p)$ and subtract the tabular value
from 100.


Table 3. Angular transformation of proper fractions 





Fraction | Angle  | Fraction | Angle  | Fraction | Angle  | Fraction | Angle  | Fraction | Angle  | Fraction | Angle

*1/30 | 11.939 | *1/10 | 20.676 | 5/27  | 28.462 | 8/29  | 35.301 | *11/30 | 41.464 | 9/20  | 46.832
1/29  | 12.140 | 3/29  | 21.037 | 3/16  | 28.650 | 5/18  | 35.436 | 7/19   | 41.579 | 5/11  | 47.121
1/28  | 12.352 | 2/19  | 21.225 | 4/21  | 28.891 | 7/25  | 35.593 | 10/27  | 41.707 | 11/24 | 47.362
1/27  | 12.575 | 3/28  | 21.418 | 5/26  | 29.038 | 2/7   | 35.994 | 3/8    | 42.010 | 6/13  | 47.565
1/26  | 12.811 | 1/9   | 21.820 | *1/5  | 29.651 | 7/24  | 36.409 | 11/29  | 42.291 | 13/28 | 47.739

1/25  | 13.062 | 3/26  | 22.247 | 6/29  | 30.192 | 5/17  | 36.580 | 8/21   | 42.398 | *7/15 | 47.890
1/24  | 13.328 | 2/17  | 22.470 | 5/24  | 30.304 | 8/27  | 36.731 | 5/13   | 42.636 | 8/17  | 48.139
1/23  | 13.612 | 3/25  | 22.700 | 4/19  | 30.475 | *3/10 | 36.987 | 7/18   | 42.914 | 9/19  | 48.335
1/22  | 13.915 | 1/8   | 23.182 | 3/14  | 30.766 | 7/23  | 37.286 | 9/23   | 43.070 | 10/21 | 48.494
1/21  | 14.240 | 3/23  | 23.697 | 5/23  | 31.004 | 4/13  | 37.516 | 11/28  | 43.171 | 11/23 | 48.625

1/20  | 14.590 | *2/15 | 23.968 | 2/9   | 31.373 | 9/29  | 37.697 | *2/5   | 43.633 | 12/25 | 48.735
1/19  | 14.967 | 3/22  | 24.248 | 5/22  | 31.756 | 5/16  | 37.844 | 11/27  | 44.110 | 13/27 | 48.829
1/18  | 15.376 | 4/29  | 24.392 | 3/13  | 32.019 | 6/19  | 38.068 | 9/22   | 44.218 | 14/29 | 48.909
1/17  | 15.821 | 1/7   | 24.841 | *7/30 | 32.211 | 7/22  | 38.231 | 7/17   | 44.390 | *1/2  | 50.000
1/16  | 16.308 | 4/27  | 25.315 | 4/17  | 32.357 | 8/25  | 38.354 | 12/29  | 44.520

*1/15 | 16.844 | 3/20  | 25.480 | 5/21  | 32.566 | 9/28  | 38.451 | 5/12   | 44.705
2/29  | 17.133 | 2/13  | 25.819 | 6/25  | 32.707 | *1/3  | 39.253 | 8/19   | 44.986
1/14  | 17.438 | 3/19  | 26.172 | 7/29  | 32.809 | 10/29 | 40.021 | 11/26  | 45.116
2/27  | 17.760 | 4/25  | 26.354 | 1/4   | 33.442 | 9/26  | 40.109 | 3/7    | 45.467
1/13  | 18.101 | *1/6  | 26.924 | 7/27  | 34.114 | 8/23  | 40.221 | *13/30 | 45.771

2/25  | 18.463 | 5/29  | 27.408 | 6/23  | 34.231 | 7/20  | 40.365 | 10/23  | 45.864
1/12  | 18.848 | 4/23  | 27.534 | 5/19  | 34.395 | 6/17  | 40.560 | 7/16   | 46.037
2/23  | 19.259 | 3/17  | 27.746 | *4/15 | 34.647 | 5/14  | 40.837 | 11/25  | 46.196
1/11  | 19.698 | 5/28  | 27.920 | 7/26  | 34.830 | 9/25  | 41.026 | 4/9    | 46.479
2/21  | 20.169 | 2/11  | 28.187 | 3/11  | 35.078 | 4/11  | 41.265 | 13/29  | 46.723









































For fractions exceeding one-half, subtract the fraction from unity and the angle from one hundred. Each thirtieth 
is marked by an asterisk.










W. L. STEVENS 73 


None of these advantages can be considered as overwhelming, but, since a new tabulation
was to be undertaken, they were sufficient to lead to the choice of this form for the
transformation. As an alternative, we also considered taking θ as one-tenth of the expression
above, but the resultant table would look so much like a table of probits that it would not
be at all difficult to enter one in mistake for the other.

Table 1 gives θ for p = 0(0.1)50 with proportional parts for linear interpolation. If
100p > 50, enter the table with 100(1 - p) and subtract the tabular value from 100. For
example, if 100p = 62.77, we enter with 37.23 and, finding that θ = 41.833, obtain for the
required transformed variable a value of 58.167. Table 2 gives a finer tabulation for the first
2 %, p = 0(0.01)2 %, and hence for the last 2 % of the range, although we note that it
will not usually be profitable to use the angular transformation when p is close to 0 or 100.
Table 3, following the pattern of the original table in Fisher & Yates, gives the angles
corresponding to proper fractions with denominators not exceeding 30.
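
The rule for entering the table with the complement can be checked numerically. The sketch below is only illustrative: it assumes that the transformation has the form θ = 50 + √4000 (arcsin √p - π/4), with p a proportion. That form is not stated in this passage; it is inferred because it reproduces the tabulated angles (33.442 for 1/4, 50.000 for 1/2) and the constants 0.327 and 6.325 quoted under Table 2. All function names are hypothetical.

```python
import math

# Assumed scale factor; with p in per cent it yields the quoted constant 6.325.
SCALE = math.sqrt(4000.0)


def theta(percent):
    """Assumed angular transform of a percentage between 0 and 100."""
    p = percent / 100.0
    return 50.0 + SCALE * (math.asin(math.sqrt(p)) - math.pi / 4.0)


def theta_by_table_rule(percent):
    """Mimic use of Table 1: above 50 %, enter with the complement
    and subtract the tabular value from 100."""
    if percent > 50.0:
        return 100.0 - theta(100.0 - percent)
    return theta(percent)


def theta_small(percent):
    """Approximation quoted for very small percentages (p in per cent)."""
    return 0.327 + 6.325 * math.sqrt(percent)


print(round(theta(25.0), 3))                 # fraction 1/4 of Table 3
print(round(theta(37.23), 3))                # value entered in the worked example
print(round(theta_by_table_rule(62.77), 3))  # transformed value for 62.77 %
print(round(theta(0.04), 3), round(theta_small(0.04), 3))
```

Under the stated assumption the complement rule and direct evaluation agree, since the assumed transform is symmetric about 50.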

The writer has also calculated a table of angular values for adjustments of special accuracy, 
analogous to Table XIV given by Fisher & Yates (1938). This table is not reproduced here. 


The tables were computed with the aid of British Association Mathematical Tables, 
vol. 1 (1931). 


REFERENCES 


Bartlett, M. S. (1937). Subsampling for attributes. Suppl. J. R. Statist. Soc. 4, 131-5.

Bliss, C. I. (1937). The analysis of field experimental data expressed in percentages. Pl. Prot.,
Leningr., 12, 70-72 (in Russian).

British Association Mathematical Tables (1931). I. Circular and Hyperbolic Functions. Cambridge
University Press.

Eisenhart, C., Hastay, M. W. & Wallis, W. A. (1947). Selected Techniques of Statistical Analysis.
New York: McGraw-Hill Book Co. Inc.

Fisher, R. A. (1922). On the dominance ratio. Proc. Roy. Soc. Edinb. 42, 321-41.

Fisher, R. A. & Yates, F. (1938). Statistical Tables for Biological, Agricultural and Medical Research.
London: Oliver and Boyd.

Snedecor, G. W. (1946). Statistical Methods. The Iowa State College Press.











[ 74 ] 


TESTS OF SIGNIFICANCE IN A 2x2 CONTINGENCY TABLE: 
EXTENSION OF FINNEY’S TABLE 


Computed By R. LATSCHA 
Institute of Actuarial Science, University of Berne 


EDITORIAL FOREWORD 


Finney (1948) has given a table which may be used to test the significance of the deviation
from proportionality in any 2 x 2 contingency table having both the frequencies in one of
its margins less than or equal to 15. The table printed below extends the range of Finney’s
table up to marginal frequencies of 20. As the interpretation and uses of the new table are
exactly similar to those of the 1948 table, only a brief introductory statement is required.*
Using Finney’s notation, the contingency table should be arranged in the form














                    Number of
              Successes     Failures          Total
Series I          a           A - a             A
Series II         b           B - b             B
Total         r = a + b   A + B - a - b       A + B




















where series I is defined to be that which makes A ≥ B, and the type of observation
conventionally regarded as a ‘success’ is that which makes a/A ≥ b/B. The table of significance
levels is arranged in sections according to the value of A; the sections for A = 9(1)15 were
given by Finney, while those for A = 16(1)20 computed by Latscha are printed below.

For given data, the table is entered in the section for A, the subsection for B and the
line for a; then the main body of the table shows in bold type the appropriate significance
points for b. Thus if the observed value of b is equal to or less than the bold integer in the
column headed 0.05, 0.025, 0.01 or 0.005, then a/A is significantly greater than b/B
(single-tail test) at these probability levels. On the other hand, for the two-tail test, if b is equal
to or less than the integer in a given column, a/A is significantly different from b/B at
a probability level equal to twice the figure heading that column, i.e. at the 0.10, 0.05,
0.02 and 0.01 levels, respectively. A dash, or absence of an entry, for some combination of
A, B and a indicates that no 2 x 2 table in that class can show a significant effect at that
level.

Owing to the discontinuous character of the hypergeometric distribution, the
conditional probability that, for a given value of a + b, b will be equal to or less than the value
specified in bold type will generally be less, and often very considerably less, than that
shown at the head of the column; the true probabilities are given in small type.
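
The small-type probabilities can be reproduced from the hypergeometric distribution. The following is a minimal sketch and not the method by which the table was computed; the function name `tail_probability` is hypothetical, and the code assumes the usual conditional formulation with all margins fixed, in which b of the r = a + b successes fall in series II.

```python
from math import comb


def tail_probability(A, B, r, b):
    """Probability, under independence and with margins A, B and r fixed,
    that the number of successes in series II is b or fewer."""
    total = comb(A + B, r)
    lowest = max(0, r - A)        # series I cannot hold more than A successes
    highest = min(b, B, r)
    favourable = sum(comb(B, k) * comb(A, r - k) for k in range(lowest, highest + 1))
    return favourable / total


# Entry for A = 16, B = 16, a = 16 in the 0.05 column: bold 11, small type .022
print(round(tail_probability(16, 16, 16 + 11, 11), 3))  # prints 0.022
```

Because b can take only integer values, the attained probability (here 0.022) generally falls short of the nominal level heading the column (here 0.05).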

* Copies of Finney’s table are available as a Biometrika ‘separate’ and the present extension will
be made available in similar form. Finney’s table, but not the extension, is included in the new
Biometrika Tables for Statisticians now at Press.












As an illustration, we may use Lange’s data on criminality among twin brothers or
sisters of criminals (Fisher, 1946, §21.01). This example was taken by Finney (1948), but
as A > 15 he used it to show how his table could be extended under certain conditions. As
A ≤ 20, direct entry is now possible in Latscha’s table.

The contingency table shows the number of twin brothers or sisters of criminals who had
also been convicted of crime, classing separately monozygotic and dizygotic (but
like-sexed) twins:




















               Not convicted    Convicted    Total
Dizygotic        15 (= a)           2        17 (= A)
Monozygotic       3 (= b)          10        13 (= B)
Total            18                12        30














Following the rule given above, the letters A, B, a and b are associated with the observed
frequencies as shown. The null hypothesis is that the twin of a criminal is no more likely
to be convicted of crime if the twinning is monozygotic than if it is dizygotic. If the only
deviation from the hypothesis which we are prepared to consider is that monozygotic
twins behave more similarly than dizygotic, a single-tail test will be appropriate and we
shall ask whether a/A = 15/17 is significantly greater than b/B = 3/13.

Turning to the appropriate section of the table with A = 17, B = 13 and a = 15 we find
that the observed value of b = 3 is significant at the 0.5 % level, since it is less than 4, the
last entry in the row of bold figures.

The figure in small type following b = 4 indicates that for a contingency table with
marginal frequencies

A = 17, B = 13, r = a + b = 15 + 4 = 19, A + B - a - b = 11,

the conditional probability of an arrangement within the table with b ≤ 4 is 0.002 on the
null hypothesis of independence. The probability that b ≤ 3 within the observed table
(having a + b = 18) is not recorded, but is < 0.002.
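
The worked example can be retraced numerically. The sketch below is illustrative only and is not Latscha’s procedure; the names `tail_probability` and `significance_point` are hypothetical. For given A, B and a it searches upwards from b = 0 and keeps the last b whose hypergeometric tail probability, with r = a + b, does not exceed the nominal level.

```python
from math import comb


def tail_probability(A, B, r, b):
    """P(successes in series II <= b) with margins A, B and r fixed."""
    total = comb(A + B, r)
    lowest = max(0, r - A)
    highest = min(b, B, r)
    return sum(comb(B, k) * comb(A, r - k) for k in range(lowest, highest + 1)) / total


def significance_point(A, B, a, level):
    """Return (b, probability) for the largest b found before the tail
    probability first exceeds `level`, or None if even b = 0 fails
    (shown as a dash in the table)."""
    best = None
    for b in range(0, B + 1):
        p = tail_probability(A, B, a + b, b)
        if p > level:
            break  # stop at the first b that is not significant
        best = (b, p)
    return best


# Lange's data: A = 17, B = 13, a = 15, observed b = 3
b_crit, prob = significance_point(17, 13, 15, 0.005)
print(b_crit, round(prob, 3))                       # prints 4 0.002
print(round(tail_probability(17, 13, 18, 3), 6))    # prints 0.000465
```

The second figure is the probability for the observed table itself, which the text describes as not recorded but less than 0.002.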

As far as possible, checks on the internal consistency of the table have been made as 
well as comparisons with the more extensive tables for the special case A = B published 
by Swaroop (1950). 


REFERENCES 


Finney, D. J. (1948). Biometrika, 35, 148. 
Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed. Edinburgh: Oliver and Boyd.
Swaroop, S. (1950). Indian Med. Res. Mem., no. 35. 








Significance tests in a 2 x 2 contingency table 





















































——— 
Probability Probability 
a a a 
0-05 0-025 0-01 0-005 0-05 0-025 0-01 0-005 
}—— 
A=16 B=16| 16 | 11 022 | 11 022 | 10 009 | 9 003 |A=16 B=12| 16| 8 024 | 8 024 | 7-008 | 6 mA=16 B: 
15 | 10 -041 9 -019 8 -008 7 -003 15 | 7 036 | 6 -013 5 -004 5 04 
14 | 8 -027_-| 7 -o12 6 -005-| 6 -005- 14 | 6 -040 5 -015-| 4 -005-| 4 os 
13 | 7 -033 6 -015s-| 5 -006 4 -002 13 | 5 -039 4 -014 3 -004 3 -0%4 
12 | 6 037 | 5 016 | 4-006 | 3 -002 12| 4 034 | 3 012 | 2 003 | 2% 
11 | 5 -038 4 016 3 006 | 2 -002 11 | 3 -027 2 -008 2 -008 1 
10 | 4 -037 3 -015-| 2 -00s-| 2 -00s- 10 | 2 -019 2 -019 1 -005-| 1 -00s 
9] 3 -033 2 -012 1 -003 1 -003 9| 2 -040 1 -011 0 002 | 0 -m 
8 | 2 027 1 -008 1 -008 0 -001 8 | 1 -024 1 -024 0 004 | O -m 
7} 1 -o19 1 -019 0 -003 0 -003 7 | 1 -048 0 010-| O -o10-| — 
6| 1 -041 0 009 | © 009 | — 6| O -021 0 021 | — Sy 
5 | O 022 0 022 | — — 5| 0 044 |— ues in 
15| 16 |11 043 |10 018 | 9 007 | § -o02 
15 | 9 -033 8 -014 7 005+ | 6 -002 11| 16 | 7 -019 7 -019 6 -006 5 -m 
14 | 8 044 | 7 -o19 6 -008 5 -003 15 | 6 -027 5 -009 5 009 | 4 
13 | 6 -023 6 -023 5 -009 4 -003 14| 5 -027 4 -009 4 -009 3 -om 
12 | 5 -02 5 -024 4 -009 3 -003 13 | 4 -024 4 -024 3 -008 2 
11 | 4 -023 4 -023 3 -008 2 -002 12 | 3 -019 3 019 | 2 oost| 1 m1 
10 | 4 -049 3 020 | 2 -006 1 -001 11 | 3 -041 2 -013 1 -003 1 0 
9] 3 -043 2 -016 1 -004 1 -004 10 | 2 -028 1 -007 1 -007 0 wi 
8 | 2 -035-| 1 -o10+| Q -002 0 -002 9| 1 -016 1 -016 0 -002 0 
7 | 1 -023 1 -023 0 -004 0 -004 8 | 1 -033 0 006 | 0 006 | — 
6| 0-01 | O-on | — _ 7| 0-013 | 0-013 | — —_ 
5| 0 026 | — — _— 6| 0-027 | — oe ~~ 
14| 16 | 10 -037 9 014 8 -00s+| 7 -002 
15 | 8 025+] 7 -o10-; 7 -010-| 6 -003 10} 16 | 7 -046 6 -014 5 -004 5 04 
14| 7 -032 6 -013 5 -005-| 5 -00s- 15 | 5 -018 § -018 4 005+ | 3 00 
13 | 6 -035+| 5 -014 4 005+ | 3 -001 14) 4 -018 4 -018 3 -005-| 3 os 
12 | 5 -035+| 4 014 3 -005s-| 3 -00s- 13 | 4-042. | 3 -014 2 -003 2 0 
11 | 4 -033 3 -012 2 -004 2 004 12 | 3 -032 2 -009 2 -009 1m 
10 | 3 -028 2 -009 2 -009 1 -002 11 | 2 -o21 2 021 1 -00s-| 1 00 
9) 2 021 2 021 1 -006 0 -001 10 | 2 -042 1 -o11 0 002 | 0 -m 
8 | 2 -o4s-| 1 -013 0 -002 0 -002 9] 1 -023 1 -023 0 004 | O -m 
7 | 1 030 0 -006 0 006 | — 8] 1 -045-| O -008 0 cos | — 
6| O -013 0 013 | — _ 7\ 0-017 0-017 | — — 
3) Om i= - on 6| 0 -03s-| — mi ata 
13| 16 | 9 -030 8 -011 7 -004 7 -004 
15 | 8 -047 7 019 6 -007 5 -002 9 | 16| 6 -037 5 010-| 5 -010-| 4 -m 
14 | 6 023 | 6 -023 5 -008 4 -003 15 | 5-040 | 4 -012 3 -003 3 -003 
13 | 5 -023 5 023 | 4 -008 3 -003 14| 4 -034 3 -010-| 3 -o10-| 2 om 
12 | 4-022 | 4-022 | 3-007 | 2 -o02 13 | 3 -025+| 2-007 | 2-007 | 1 
11 | 4 -048 3 018 | 2 -oos+| 1 -001 12| 2-016 | 2 -016 1 -003 1 
10 | 3 039 | 2 -013 1 -003 1 -003 11 | 2 -033 1 -008 1 -oo8 | O m1 
9| 2 -029 1 -008 1 -008 | O -o01 10 | 1 -017 1 017 | 0 002 | 0 -m 
8 | 1 -o18 1 -018 0 -003 0 -003 9} 1-034 | © -006 0 006 | — 
7 | 1 038 0 -007 0 007 | — 8| 0-012 | O -o12 | — —_— 
6| 0 017 0 017 | — — 7| 0-024 | 0 07% | — — 
5| 0-037 | — _ — 6| O 045+) — — — 





The table shows:

(1) In bold type, for given A, B and a, the value of b (< a) which is just significant at the probability level
quoted (single-tail test).

(2) In small type, for given A, B and r = a + b, the exact probability (if there is independence) that b is equal to
or less than the integer shown in bold type.














Significance tests in a 2 x 2 contingency table (continued)


























































Probability Probability 
a a a 
0-05 0-025 0-01 0-005 0-05 0-025 0-01 0-005 
“= 
A=17 B=14| 17 | 10 032 | 9 -012 8 004 | 8 004 | A=17 B=11] 13 | 4 -042 3-014 | 2 004 | 2 =17 B 
16 | 8 -o21 8 -021 7 -008 6 -003 12} 3 -031 2 -009 2 -009 1 
15 | 7 026 | 6 -010-| 6 -010-| 5 -003 11 | 2 -020 2 -020 1 -oos-| 1 
14 | 6 -028 5 -011 4 004 | 4 -004 10 | 2 -040 1 -o11 0 001 | 0 
13 | 5 027 | 4-010-| 4 -010-| 3 -003 9] 1 -022 1 022 | O 004 | 0 
12 | 4 -024 4 -024 3 -008 2 -002 8 | 1 -042 0 -008 0 008 | — 
11 | 4 -049 3 -019 2 -006 1 -001 7| 0-016 | 0 0f6 | — a 
10 | 3 -040 2 -014 1 -003 1 -003 6| 0 033 | — _ we 
9| 2 029 1 -008 1 -008 0 -001 
8 | 1-018 | 1-018 | 0 003 | O -003 10|17| 7-041 | 6 012 | 5-003 | 5; 
7 | 1-038 | 0 007 | O 007 | — 16 | 6 047 | 5 o1st| 4-004 | 4 
6| 0-017 | 0 017 | — 15 | 5 -043 4 -014 3 004 | 3 
5| 0-036 | — — — 14 | 4-034 | 3 010+} 2-002 | 2 
13 | 3 024 3 024 | 2 -007 1 
13| 17 | 9 -026 8 -009 8 -009 7 003 12 | 3 -049 2 015+} 1 -003 1 
16 | 8 -040 7 015+ | 6 00st} 5 -002 11 | 2 -031 1 -007 1 007 | 0 
15 | 7 045+] 6 018 5 -006 4 -002 10 | 1 -016 1 -016 0 002 | 0 
14| 6 045+] 5 -018 4 -006 3 -002 9| 1 -031 0 -00s+} O -005+ | — 
13 | 5 -042 4 -016 3 -00s+| 2 -001 8 | O -o11 0 -o1 | — = 
12 | 4 -035+| 3 -013 2 -004 2 -004 71 0 -022 0 022 | — ia 
11 3 028 2 -009 2 -009 1 -002 6] 0 042 | — —_— ~— 
10 | 2 -019 2 019 1 -00s-| 1 -00s- 
9| 2 040 1 -o11 | G 002 | O -002 9 |17| 6-032 | 5 008 | 5 cos | 4 
8 | 1-024 | 1 024 | 0 004 | O -004 16 | 5 034 | 4 -010-| 4 -o10-| 3 
7 | 1-047 | 0 o10-| O -o10-| — 15 | 4 -028 3 -008 3-008 | 2 
6 0 021 | O oz | — — 14| 3-020 | 3-020 | 2 -oos-| 2 
5| 0-043 | — _ —_ 13 | 3-04 | 2-012 | 1-002 | 1 
12 | 2 025+} 1 006 | 1 -006 | 0 mi 
12} 17| 8-02 | 8 02 | 7 007 | 6 -002 11 | 2-048 | 1-012 | O 002 | O 
16| 7-030 | 6 o11 | 5 003 | 5 -003 10 | 1-024 | 1-024 | O 004 | 0 
15 | 6 033 | 5-012 | 4-004 | 4 -004 9 | 1 -045-| 0 008 | O 008 | — 
14| 5 -030 4 011 3 -003 3 -003 8 | 0 016 0 016 | — _ 
13 | 4 -026 3 -008 3 -008 2 -002 7} 0 030 | — _ 
12 | 3 -020 3 -020 2 006 1 -001 
11 | 3 041 | 2-013 | 1-003 | 1 -003 8 | 17| 5-024 | 5 0246 | 4-006 | 3 
10 | 2 -028 1 -007 1 -007 0 -001 16 | 4 -023 4 -023 3 -006 2 wi 
9| 1-016 | 1-016 | O coz | O -002 15 | 3 017 | 3 017 | 2 004 | 2 
S| 12 | Os | Om | — 14| 3 039 | 2-o10-| 2 -o1-| 1 
7} 0 012 | O 012 | — —_ 13 | 2-022 | 2-022 | 1-004 | 1 
6 | 0 02% | — ~- cat 12 | 2-043 | 1 -o10-| 1 -o10-| 0 
11 | 1 -020 1 020 | 0 003 | 0 
11| 17 | 7 016 7 -016 6 -005-| 6 -005- 10 | 1 -038 0 -006 0 006 | — 
16 | 6 022 6 -022 5 007 | 4 -002 9| 0-012 | 0 012 | — — 
15 | 5 022 | § 022 | 4-007 | 3 -002 8} 0-022 | O 022 | — — 
14| 4-019 | 4-019 | 3 006 | 2 001 7} 0-040 |— +e a 







Significance tests in a 2 x 2 contingency table (continued)


















































Probability Probability 
a a 
0-005 0-05 0-025 0-01 0-005 0-05 0-025 001 0-005 
2 ofA=17 B=7 | 17 | 4-017 | 4-017 3 -003 3 003 |A=18 B=18| 18 | 13 -023 13 -023 | 12 -o10-| 11 -004 
1 © 16 | 3 -014 3 -014 2 -003 2 -003 17 | 12 -044 | 11 -020 | 10 -009 9 -004 
1 15 | 3-038 | 2-009 | 2 -o09 1 -001 16 | 10 030 | 9-014 | 8 -006 | 7 -o02 
0 -m 14} 2 -o21 2 -021 1 -004 1 -004 15 | 9 -038 8 -018 7 008 | 6 -003 
O 04 13 | 2 -042 1 -009 1 -009 0 -00i 14] 8 -043 7 020 6 -009 5 -003 
_ 12} 1 -o18 1 -018 0 -002 0 -002 13 | 7 -046 6 -022 5 -009 4-003: 
_ 11 1 -034 0 -005-; 0 -005-!' Q -o0s- 12 | 6 -047 5 022 4 -009 3 -003 
aes 10 | 0 -010-| Q -010-| © -o10-| — 11 | 5 -046 4 -020 3 -008 2 -002 
9 0 -019 0 -019 — , aks 10 4 -043 3 -o18 2 -006 1 -001 
5 0 8 0 -033 —_ —_ anti 9 3 -038 2 -014 1 -004 1 -004 
4 8 | 2 -030 1 -009 1 -009 0 -001 
3 6 | 17] 3 -o1 3 -011 2 002 | 2 -002 7 | 1 -020 1 020 | 0 -006 | © -004 
2 16 | 3-040 | 2-008 | 2 -08 1 -001 6 | 1-044 | O -010-| 6 -o10-| — 
1 15 | 2 -o21 2 021 1 -003 1 -003 5 | 0-023 | 0 023 | — aes 
1 14| 2 -045+| 1 -009 1 -009 0 -001 
0 13} 1-018 | 1-018 | Q 02 | 0 -o02 17 | 18 | 13 -045+| 12 -o19 | 11 -008 | 10 -o03 
0 12 | 1 -035-| 0 -00s-| 0 -o0s-| 0 -o0s- 17 | 11 -036 | 10 016 | 9 007 | § -o02 
pe. 11 | O -009 0 -009 0 009 | — 16 | 10 -049 9 -023 8 -010-| 7 -004 
a 10 | 0 -017 0 017 | — red 15 | 8 -028 7 012 | 6 -005-| 6 -005- 
9] 0-030 | — fates seit 14} 7-030 | 6 -013 5 -o0s+| 4 -002 
er 8 | O -oso-| — ae oe 13 | 6 -031 5 -013 | 4 -005-| 4 -oos- 
12 | 5 030 | 4 -012 3 -004 3 -004 
4 5 [17] 3-043 | 2-006 | 2-006 | 4 -oo1 11 | 4-028 | 3 -o10+| 2 003 | 2 -003 
3 16 | 2 024 | 2-024 | 1-003 | 1 -003 10 | 3 023 | 3-023 | 2-008 | 1 002 
2 15 | 1-009 | 1-009 | 4-009 | © -001 9| 3047 | 2 018 | 1 -oos-! 1 -oos- 
2 14; 1-021 | 1-021 | 0 02 | 0 -o2 8} 2 poudll Be eel ee buodd Be pect 
1 13 | 1-039 | © -00s-| © -o0s-| © -oos- 7 | 1 02s~| 1 -025-| © -005-| © -oos 
0 wi 12 | 0 010-| 0 -o10-| 0 -o10-| — 6; Oo | O-on | — — 
0 11 | 0 018 | 0 -o1s | — pee 5] 0 02 | — ae isi 
0 0% 10 | 0-030 | — gi Fe 
9|/ 0-04 | — cae ae 16} 18 | 12 -039 | 11 -016 | 10-006 | 9 -o02 
me 17 | 10 029 | 9 012 | 8 -cos-| § -o0s- 
—] 4} tt] 200 | ra | 05 | 10 sl eme| role ee 
16] 1 011 1 -o11 0 -001 0 -001 14| 7-046 | 6 -020 5 008 | 4 -003 
3 00 15 | 1 -028 0 -003 0 -003 0 -003 13 | 6 045+} § 020 | 4 -007 3 -002 
2 wi 14 | © 005 | 0 006 | 0 006 | — 12} 5 042 | 4-018 | 3-006 | 2 002 
2 13 | 0-012 | 0 -o12 | — “le ahaa 3 -o1s-| 2-006 | 2 -004 
1 od a Nels ies 10} 3 031 | 2-011 | 1-003 | 1 -003 
1 IL | O s+) — pa 4 9} 2-023 | 2-023 | 1-006 | Q -001 
0 wi 8| 2-04 | 1-014 | 0 002 | 0 002 
0 3 | 17] 1 -o16 1 -016 0 -001 0 -001 71 1-030 0 -006 0 006 | — 
Es 16 | 1 -046 0 -004 0 -.004 | 0 -004 6| 0 -014 0 014 | — < 
= 15 0 009 0 “009 0 “009 | _ 5 0 031 oe es Ne 
sn 14 0 -018 0 -o18 | — ee 
eR 13 | 0-031 | — _ —— 15} 18 | 11 -033 | 10 -013 | 9 -o05-| 9 -o0s- 
12) 0 049 | — |— _ 17 | 9-023 | 9023 | 8 009 | 7 -003 
16 | 8 -029 7 -012 6 -004 6 -004 
ig 2 | 17} 0 006 | 0 006 | 0 006 | — 15 | 7 031 | 6 013 | § -005-| § -oos- 
vility ' 16; 0-018 | 0 o18 | — a 14 | 6 -031 5 013 | 4-004 | 4 -004 
15 | 0 035+} — —_ _ 13 | 5 029 | 4-011 | 3-004 | 3 -004 

























































































Significance tests in a 2 x 2 contingency table (continued) 
Probability Probability 
a a ae 
0-05 0-025 0-01 0-005 0-05 0-025 0-01 0-005 
A=18 B=15/ 12 | 4 025+} 3 009 | 3-009 | 2-003 |A=18 B=12| 10/| 2-038 | 1 -o10+| 0-001 | 0 / 18 
11 | 3 -020 3 -020 2 -006 1 -001 9} 1 021 1 -021 0 -003 0 «) 
10 |.3 041 | 2-014 | 1-004 | 1 -004 8 | 1-04 | 0 007 | 0-07 |— ! 
9| 2 -030 1 -008 1 -008 0 -001 7) O -016 0 016 | — —_ 
8 1 -018 1 -018 0 -003 0 -003 6| 0-031 | — _— —_ 
7 1 -038 0 -007 0 007 | — 
6 | 0 -017 0 017 | — “a 11] 18 | 8 -o4s+] 7 -014 6 004 | 6 -% 
5| 0 036 | — — ane 17 | 6 -018 6 -018 5 0066 | 4-0 
16 | 5 018 | 5 -018 | 4 -005+| 3 
14/ 18 | 10 -o28 9 -010-| 9 -o10-| 8 -003 15 | 5 -043 4 015-| 3 -004 3 0 
17 | 9 -043 8 -017 7 -006 6 -002 14} 4 -033 3 -o11 2 003 | 2 
16 | 8 -0so-| 7 -o21 6 -008 5 -003 13 | 3 -023 3 -023 2 -007 1 -0; 
15 | 6 -022 6 -022 5 -008 4 -003 12 | 3 -046 2 -014 1 -003 00: 
14 | 6 -049 5 -020 4 -007 3 -002 11 | 2 -029 1 -007 1 -007 00 
13 | 5 -044 4 -017 3 -006 2 -001 10 | 1 -015-] 1 -o15-| 0 002 | O 0 
12 | 4 -037 3 -013 2 -004 2 -004 9} 1 -029 0 -oss-| O -00s- 00: 
11 | 3 -028 2 -009 2 009 1 -002 8 | O 010+] O -oro+} — — 
10 | 2 -020 2 020 1 -00s-} 1 -00s- 7 | 0 -020 0 020 | — _ 
9} 2 -039 1 -011 0 -002 0 -002 6| 0 039 | — _ _ 
8 | 1 -024 1 024 | 0 -004 | O -004 
7 | 1 047 0 -009 0 009 | — 10| 18 | 7 -037 6 ort; § 003 | 5 alin 
6| 0 -020 0 020 | — <a 17 | 6 041 5 -013 4-003 | 40/% 
5| 0-043 | -— oe = 16| 5-036 | 4-01 3 003 | 3 4 jx 
15 | 4-028 | 3 -008 3-008 | 2 0 
13} 18 | 9 -023 9 -023 8 -008 7 -002 14} 3 -019 3 -019 2 oos-| 2-0 
17 | 8 -034 7 012 6 004 | 6 -004 13 | 3 -039 2 011 1 002 | 1 € jx 
16 | 7 -037 6 -014 5 -00s-} 5 -00s- 12 | 2 -023 2 -023 1 -o0s+ |} O04 | 4 
15 | 6 -036 5 014 4 -004 4 -004 11 | 2 -043 1 -011 0 -001 0 |" 
14 | 5 032 | 4 012 3 -004 3 -004 10 | 1 -022 1 022 | 0 003 | 0 0 
13 | 4 027 3 -009 3 -009 2 -002 9| 1 -040 0 -007 0 007 | — 
12 | 3 -02 3 -020 2 -006 1 -001 8| 0-014 | 0 014 | — — 
11 | 3 -040 2 -013 1 -003 1 -003 7| 0-027 | — _ — 
10 | 2 027 1 -007 1 -007 0 -001 6| 0-04 | — —_ — 
9 | 1 015+} 1 015+] Q 002 | O -002 
8 | 1-031 | 0 006 | 0 006 | — 9 118 | 6-029 | 5 007 | 5 007 | 4- 
7| 0-012 | O -o12 | — — 17 | 5 030 | 4-008 | 4-008 | 3 
6} O -o2s+} — oa ae 16 | 4 023 4 023 3 006 | 2 Ol 
15 | 3 016 3 016 | 2 004 | 2 0% 
12} 18 | 8 -o18 8 -018 7 -006 6 -002 14! 3 -034 2 -009 2 -009 1 -002 
17 | 7 -026 6 -009 6 -009 5 -003 13 | 2 019 2 019 1 -004 1 -004 
16 | 6 027 5 -009 5 -009 4 -003 12 | 2 -037 1 -009 1 009 | O 
15 | 5 02 5 -024 4 -008 3 -002 11 | 1 018 1 -018 0 002 | 0: 
14} 4 -020 4 -020 3 -006 2 -001 10 1 -033 0 -00s+ | O -oost+ | — | 
13 | 4 -042 3 -014 2 -004 2 -004 9| O o10+; O o10+}) — — | 
12 | 3 -030 2 -009 2 -009 1 -002 8| 0-020 | 0 020 | — — 
11 | 2 -o19 2 019 1 -00s-| 1 -00s- 7| 0-03 |— _ — 
































Significance tests in a 2 x 2 contingency table (continued) 
Probability Probability 
i a a 
1 0-005 0-05 0-025 0-01 0-005 0-05 0-025 001 0-005 
1 | O/ -1g B=s | 18| 5-022 | 5 022 | 4-005-| 4 -005- A=18 B=4 | 13| 0-017 | 0 017 | — Ni 
03) 0 4 17| 4-020 | 4-020 | 3 004 | 3 -004 ol ees |x a. at 
ft hes, 16 | 3-014 | 3 014 | 2-003 | 2 -003 11 | © -o4s+} — <6 ain 
Wir: 15 | 3 -032 2 -008 2 -008 1 -001 
Ba 14 | 2 -017 2 -017 1 -003 1 -003 3 | 18 | 1 -o14 1 -014 0 -001 0 -001 
13 | 2 -034 1 -007 1 -007 0 -001 17 | 1 -041 0 -003 0 -003 0 -003 
4 | 6% 12 | 1 015+} 1 -015+] © -002 | © -002 16°| © -008 | 0 008 | O -oos | — 
06 | 40 11 | 1-028 | © 004 | © 004 | © -004 15 | 0 -o1s+| © -o1s+| — = 
05+ ; w 10 | 1-04 | Q 008 | 0 -oos | — 14| 0 02 | — He cup 
2 * 9} 0-016 | 0 o16 | — ot 13 | Oo | — a as 
3 | 2-0 02g | —_ hl wae 
07 | 1 0 ; ; po ee das! tg 2 | 18 | O -00s+| © -005+| © -oos+ | — 
03 1 ©: 17 | 0 -016 0 016 | — ess 
7 | 0% 7 | 18 | 4 -015+| 4 -o15+| 3-003 | 3 -003 16 | 0 032 | — — — 
i 0 be 17 | 3 012 | 3 012 | 2-002 | 2 -002 
ds A -adl Bred ell Rael VOR ER ee ee eee ee 
15 | 2 -017 2 -017 1 -003 1 -003 18 |13 a 
mex: -045~ | 12'-021 | 11 -009 | 10 -004 
14 | 2 -034 1 -007 1 -007 0 -001 17111 031 |10 -o1s-| 9 -006 Pape 
Te 13 | 1 -014 1 -014 0 -002 0 -002 16 | 10 
-039 9 -019 8 -009 7 -003 
12 | 1 -027 0 -004 0 -004 0 -004 
023 | 5 Oo | K 11| 1-04 | © -007 0-0 |— 15 | 9 046 | 8 022 | 6 -004 | 6 -004 
023 | 40K 10 | O -013 0 013 | — ro 14} 8 -0so-| 7 -024 5 -004 5 -004 
03 3 & jm 9] O -024 0 024 | — mies 13 | 6 025+] 5 -o11 4 -004 4 -004 
os | 2 00 8 | 0 04 12 | 5 024 | 5 -024 3 -003 3 -003 
os-| 20 a, ast me 11 | 5 -oso-| 4-022 | 3 -009 | 2 -003 
2 | Le ln 6 | 18} 3 -010-| 3 o10-| 3 -o10-| 2 -oo1 r ; = ; ed i ys : os 
s+] O04 |i6 17 | 3 035+] 2 006 | 2-006 | 1-001 sl tarball tates 
1 | 0 ey 16 | 2-018 | 2-018 | 1-003 | 1 -003 
03 | O 0 7| 1-021 1 -021 0 -004 0 -004 
15 | 2 -038 1 -007 1 -007 0 -001 6) tee bea! tari. 
al ‘ap: 14| 1 -01s-| 1 -015-| © 002 | © -002 5 | 0-023 | 0 -023 
uri 13 | 1 -028 0 -003 0 -003 0 -003 WE wo 
ix 2) we | Oe ae. | — 18] 19 | 14 046 |13 020 | 12 -o08 | 11 -003 
fs, 11 | 0 013 | O 013 | — ie, 18 | 12 037 |11 -017 |10 007 | 9 -003 
7 | 4m Nn), om | oem |— om 17 | 10 024 |10 024 | § 004 | 8 -004 
| 3 9| 0 07 | — se 16 | 9-030 | 8-014 | 7-006 | 6 -002 
6 2 001 5 118 | 3-04 iP as te aus we 15 | 8 -033 7 015+ | 6 -006 5 -002 
4 | 2m 17 | 2-021 | 2-021 1 -003 | 1 -003 1 : 4 snd a So 
9 1 02 os © a gee Fes ieee: 13 | 6 -035 5 -015 4 bool 3 ‘002 
4 1 -004 is | 1-017 : es } ae a 12 : 033 : 014 : -005 : -005 
9 0 001 141 1-033 oan 6 am ‘a 11 -030 O11 -004 -004 
2 | Om 10 | 3 -02s-| 3 -025-| 2 008 | 1 -002 
st | “ : 007 ; 007 | 0 007 | — 9| 3-049 | 2-019 | 1 -o05+| O -001 
a: 014 ld as = 8 | 2-038 | 1-012 | 0 002 | O -002 
we A : 024 | 0 024 | — oo 7 | 1 025+} © -005-| © -00s-| © -o0s- 
= 038 | — —- —_ 6| O -012 0 012 | — a 
e: 5 | 0-027 | — ‘big ae 
4| 18 | 2 -026 1 -003 1 -003 1 -003 
17 | 1 -010-| 1 -o10-| 1 -o10-| 0 -001 17| 19 | 13 040 | 12 -016 | 11 -006 | 10 -002 
ability ley 16 | 1 -024 1 -024 0 -002 0 -002 18 | 11 -030 | 10 -013 9 -.005+| 8 -002 
15 | 1 046 | 0 -005-| 0 -005-| Q -oos- 17 | 10 040 | 9 -o18 8 -008 7 -003 
b is equal 14 | 0 010-| 0 010-| O -o10-} — 16 | 9 047 | 8 022 | 7 009 | 6 -003 














































Biometrika 40 













































































Significance tests in a 2 x 2 contingency table (continued)
Probability Probability 
a a 
0-05 0-025 0-01 0-005 0-05 0-025 0-01 0-005 
A=19 B=17| 15 | 8 -0s0-| 7 023 | 6 -010-| 5 004 | A=19 B=14| 16] 7 042 | 6-017 | 5 -006 | 4 -002 A=19 
14 | 6 -023 6 -023 5 -010-| 4 -003 15 | 6 -039 5 -o1s+| 4 -005+| 3 001 
13 | 6 -049 5 -022 4 -008 3 -003 14 | 5 -034 4 -013 3 -004 3 -004 
12 | 5 -04s-| 4 -019 3 -007 2 -002 13 | 4 -027 3 -009 3 -009 2 -003 
11 | 4 -039 3 015+ | 2 -005-| 2 -005- 12 | 3 -020 3 -020 2 -006 1 -001 
10 | 3-032 | 2-o1 1 -003 1 -003 11 | 3 -040 2 -013 1 -003 1 -003 
9| 2 -024 2 024 1 -007 0 -001 10 | 2 -027 1 -007 1 -007 0 -001 
8 | 2 -047 1 -015-}| © -002 0 -002 9} 1 -015-| 1 -015s-}| O -002 0 -002 
7) 1 031 0 -006 0 006 | — 8 | 1 -030 0 -005+| 0 -005+ | — 
6} 0-014 | 0 014 | — — 7} 0-012 | O -o12 | — pee 
§5| 0-031 | — ea — 6] O -024 0 024 | — — 
16| 19 | 12 -03s-| 11 -013 | 10 -005- | 10 -o0s- oa} Poulan toes r 2 
18 | 10 -024 | 10 -024 9 -o10-| 8 -004 13} 19 | 9 -020 9 -020 8 -006 7: 
17 | 9 -031 8 -013 7 005+ | 6 -002 18 | 8 -029 7 010+ | 6 -003 6: 
16 | 8 -035-| 7 -015+| 6 -006 5 -002 17 | 7 -031 6 011 5 -004 5: 
15 | 7 -036 6 015+ | 5 -006 4 -002 16 | 6 -029 5 -o11 4 -003 4- 
14 | 6 -034 5 014 4 -00s+| 3 -002 15 | 5 -o25s+| 4 -009 4 -009 3 - 
13 | 5 -031 4 -013 3 -004 3 -004 14} 4 -020 4 -020 3 -006 2 : 
12 | 4 027 3 -010-| 3 -010-| 2 -003 13 | 4 -041 3 -015-| 2 -004 2: 
11 | 3 -021 3 -021 2 -007 1 -002 12 | 3 -029 2 -009 2 -009 1 - 
10 | 3 -042 2 -015-| 1 -004 1 -004 11 | 2 -o19 2 -019 1 -oos-| 1: 
9| 2 -030 1 -009 1 -009 0 -001 10}; 2 -036 1 -010-} 1 -o10-}| O- 
8 | 1 -o18 1 -018 0 -003 0 -003 9} 1 -020 1 -020 0 -003 0: 
7 1 -037 0 -007 0 007 | — 8 1 -038 0 -007 0 007 | — 
6/| 0-017 | 0-017 | — — 7) 0 -015-| O -o15-}] — _ 
5| 0-036 | — oe = 6} 0 030 | — a wp 
15} 19 | 11 -029 10 -011 9 -004 9 -004 12} 19 | 9 -049 8 -016 7 -00s-| 7- 
18 | 10 046 | 9-019 | 8 -007 | 7 -002 18 | 7 022 | 7-022 | 6 007 | 5: 
17 | 8 -023 | 8 -023 7 009 | 6 -003 17 | 6 -022 6 -022 5 007 | 4: 
16 | 7 -025-| 7 -025-| 6 -o10-| 5 -003 16} 5 019 5 -019 4 -006 3 - 
15 | 6 -024 6 -024 5 -009 4 -003 15 | 5 042 4 015+} 3 -004 3 - 
14| 5 022 | § -022 4 -008 3 -002 14] 4 -032 3 -011 2 -003 y 
13 | 5 -04s+| 4 -018 3 -006 2 -002 13 | 3 -023 3 .023 2 -006 1- 
12 | 4 -037 3 014 2 -004 2 -004 12 | 3 -043 2 -014 1 -003 1 - 
11 3 -029 2 -009 2 -009 1 -002 11 | 2 -027 1 -007 1 -007 0: 
10} 2 020 2 -020 1 -00s+ | O -001 10 | 2 -0s0-} 1 -014 0. -002 0- 
9| 2 -039 1 011 0 -002 0 -002 9] 1 -027 0 -oos-}| O -005-| 0: 
8} 1 -023 1 -023 0 -004 0 -004 8 | 1 .-050-| 0 -010-| O -o10- | — 
7) 1 046 0 -009 0 009 | — 7| 0-019 0 019 | — Ae. 
6 | 0 -020 0 020 | — — 6] 0-037 | — — —_ 
5| 0 042 | — — ze 
11} 19] 8 -o41 7 -012 6 -003 6 -003 |4 
14| 19 | 10 -024 | 10 -024 9 -008 8 -003 18 | 7 -047 6 -016 5 -004 5 -004 
18 | 9 -037 8 -014 7 -005-| 7 -00s- 17 | 6 -043 5 -015-| 4 -004 4 004 
17 | 8 -042 7 -017 6 -006 5 -002. 16 | 5 035+] 4 -012 3 -003 3 -003 





Significance tests in a 2 x 2 contingency table (continued) 


























































Probability Probability 
a a 
0-05 0-025 0-01 0-005 0-05 0-025 0-01 0-005 
A=19 B=11| 15 | 4 -027 3 -008 3 -008 2 002 | A=19 B=7 | 19:| 4 -013 4 -013 3 -002 3 -002 
14 | 3 -018 3 -018 2 -005-| 2 -005- 18 | 4 -047 3 010+} 2 002 | 2 -002 
13 | 3 -03s+| 2 -o10+} 1 -002 1 -002 17 | 3 -028 2 -006 2 -006 1 -001 
12 | 2 -021 2 -021 1 -00s-| 1 -oos- 16 | 2 -014 2 014 1 -002 1 -002 
11 | 2 -040 1 010+ | © -001 0 -001 15 | 2 -028 1 -005+| 1 -00s+| 0 -001 
10 | 1 -020 1 -020 0 -003 0 -003 14| 1 -o1 1 -o11 0 -001 0 -001 
9] 1 -037 0 -006 | 0 006 | — 13 | 1 -021 1 -021 0 -003 | O -003 
8 | O -013 0 013 | — te 12 | 1 -037 | © -005+| QO -oos+| — 
7} 0 -025-| 0 -o2s-}| — — 11 | 0 -010-| 0 -010-| O -o1o-| — 
6| 0 046 | — aie sae 10 | 0 -017 0 017 | — _ 
9] 0 030 | — “= Eee 
10} 19 | 7 033 | 6 009 | 6 -009 | 5 -002 8 | 0 048 | — —_ — 
18 | 6 -036 5 011 4 -003 4 -003 
17 | 5 030 | 4-009 | 4-009 | 3 -002 6 | 19 | 4 -050-| 3-009 | 3-009 | 2 -o01 
16 | 4-022 | 4 -022 3 -006 | 2 -oo1 18 | 3 -031 2 005+ | 2 -oost+| 1 -001 
15 | 4 -047 3 -015-| 2 -004 | 2 -004 17 | 2 015+} 2 -o15+| 1 -002 1 -002 
14 | 3-030 | 2 -co8 2 -008 1 -002 16 | 2 -032 1 -006 1 -006 | O -000 
13 | 2 -017 2 -017 1 -004 1 -004 15 | 1 -012 1 -012 0 -001 0 -001 
12 | 2 -033 1 -008 1 -008 0 -001 14 | 1 -023 1 023 | O -003 0 -003 
11 | 1 -016 1 -016 0 002 | O -002 13 | 1 -039 | © -o0s+| © -oos+| — 
10 | 1 -029 0 -005-| QO -005-| OQ -00s- 12 | 0 -010-| 0 -o10-| O -o10-| — 
91] O -009 0 -009 0 009 | — 11 | O -017 0 017 | — a 
8 | O 018 0 -o18 | — = 10 | O 028 | — ee os 
7| 0 032 | — se _ 9} O 045+} — a aes 
9 | 19 | 6 026 5 -006 5 006 | 4 -001 5 | 19 | 3-036 | 2 -oos-| 2 -o0s-| 2 -00s5- 
18 | 5 026 | 4-007 | 4-007 | 3 -001 18 | 2-018 | 2-018 | 1 -o02 | 1 -002 
17 | 4 -020 4 -020 3 -00s-| 3 -00s- 17 | 2 -042 1 -006 1 -006 | O -000 
16 | 4 -044 3 -013 2 -003 2 -003 16 | 1 -014 1 014 | O -001 0 -001 
15 | 3-028 | 2-007 | 2 -007 1 -001 15 | 1 -028 | 0 003 | 0 -003 | O -003 
14 | 2 -015-| 2 -015-| 1 -003 1 -003 14} 1 047 | 0 006 | 0 006 | — 
13 | 2 -029 1 -006 1 -006 | 0 -001 13 | O -o11 0 01 | — eS 
12} 1 -013 1 -013 0 -002 0 -002 12 | 0 019 0 019 | — — 
11 | 1 -024 1 024 | 0 004 | O -004 11 | 0 030 | — — _ 
10 | 1 042 | 0 007 | 0 007 | — 10 | 0 047 | — — _ 
9} 0-013 0 013 | — — 
8 | O -024 0 024 | — as 4119] 2 -02 2 -024 1 -002 1 -002 
7| 0-043 | — — —_ 18 | 1 -009 1 -009 1 009 | O -001 
17 | 1 -021 1 -021 0 -002 0 -002 
8 |19| 5-019 | 5-019 | 4-004 | 4 -004 16 | 1-040 | © 004 | 0 004 | O -004 
18 | 4 -017 4 017 3 -004 3 -004 15 | 0 008 | 0 008 | 0 008 | — 
17 | 4 -044 3 -011 2 002 | 2 -002 14 | 0-014 | 0 014 | — = 
16 | 3-027 | 2-006 | 2-006 | 1 -001 13 | 0 024 | 0 024 | — —_ 
15 | 2 -013 2 -013 1 -002 1 -002 12} 0 037 | — ts — 
14| 2 027 1 -006 1 -006 0 -001 
13 | 2 -049 1-011 | 0 001 | O -001 3 119] 1-013 | 1-013 | 0-001 | O -001 
12 | 1 -021 1 -021 | 0 003 | O -003 18 | 1-038 | 0 003 | 0 -003 | O -003 
11 | 1-038 | 0 006 | 0 006 | — 17 | 0 006 | 0 006 | 0 006 | — 
10 0 -011 0 -o11 a ins 16 0 -013 0 -013 —_ 
9 | 0 -020 0 020 | — x 15 | O -023 0 023 | — _— 
8 | 0 034 | — ‘dhe a 14) 0 036 | — —_— 







































































Significance tests in a 2 × 2 contingency table (continued)

                       Probability                                   Probability
              a   0.05    0.025    0.01    0.005            a   0.05    0.025    0.01    0.005
A=19 B=2 | 19 | 0 -005-| 0 -005-| 0 -oo5s-| O -00s-|] A=20 B=18)| 15 | 7 -027 6 -012 5 -004 5 04 | A=20 I 
18 | 0 -014 0 014 | — —_ 14 | 6 -026 5 011 4 004 | 4 00% 
17 | 0-02 | — _ — 13 | 5 024 | 5 024 | 4 -009 3 -003 
16 | O%o48 | — _ — 12} 5 047 | 4-020 | 3 007 | 2-m 
11 | 4 -041 3 016 | 2 -oos+] 1 -on 
10 | 3-033 | 2 -012 1 -003 1 -003 
A=20 B=20| 20 | 15 024 | 15 -024 | 13 -004 | 13 -004 9| 2-024 | 2-024 | 1 007 | O 001 
19 |14 -046 | 13 -022 | 12 -o10-| 11 -004 8 | 2 048 | 1 o15-| 0 003 | 0 0% 
18 | 12 -032 11 -015+ | 10 -007 9 -003 a 1 -031 0 -006 0 -006 —_— 
17 | 11 041 | 10 020 | 9 -009 8 -004 6| 0-014 | 0-014 | — whe 
16 |10 048 | 9 024 | 7 -00s-| 7 -os- 5| 0 031 | — _ _— 
15 | 8 -027 7 012 | 6 -o0s+| 5 -002 
14 | 7 -028 6 -013 5 -oos+| 4 -002 17| 20 | 13 036 | 12 -o14 | 11 -005+ | 10 -002 
13 6 -028 5 -012 4 -005-| 4 -005- 19 | 11 -026 10 -o11 9 -004 9 -004 
12} 5 027 | 4 011 | 3 004 | 3 -004 18 |10 034 | 9 -o1s-| 8 -006 | 7 -o02 
11 4 -024 4 024 3 -009 2 -003 17 9 -038 8 -017 7 -007 6 -003 
10 | 4 -048 3 -020 2 -007 1 -002 16 8 -040 7 -018 6 -007 5 -003 
9 3 -041 2 o1s+| 1 -004 1 -004 15 7 039 6 -017 5.007 4 002 
8 | 2-032 | 1 o10-| 1 -010-| 0 -002 14| 6 037 | 5-016 | 4-006 | 3-02 
7 1 -022 1 -022 0 -004 0 -004 13 5 033 4 -013 3 -005-| 3 -005- 
6 1 -046 0 010+ | — _ 12 4 -028 3 -o10+ | 2 -003 2 -003 
5 | 0 024 | 0 0% | — _ 11 | 3-022 | 3-022 | 2-007 | 1 2 
10 | 3 042 | 2 015+} 1 -004 1 -004 
19| 20 | 15 -047 | 14 020 | 13 -o08 | 12 -003 9} 2-031 1 -009 1 -009 | O -001 
19 | 13 -039 | 12 -o18 | 11 -008 | 10 -003 8} 1-019 | 1-019 | O 003 | O 003 
18 | 11 026 |10 012 | 9 -00s-| 9 -oos- 7 | 1.037 | O 008 | O 008 | — 
17 | 10 -032 9 .015- 8 -006 7 -002 6 0 -017 0 -017 _— —_ 
16| 9-036 | 8-017 | 7 007 | 6 -003 5| 0 036 | — _ _— 
15 | 8 038 7 018 6 -008 5 -003 
14 | 7 -039 6 -018 5 -007 | 4 -003 16] 20 | 12 -031 | 11 -012 | 10 004 | 10 -004 
13 | 6 -038 5 017 4 -007 3 -002 19 | 11 -o49 | 10 -021 9 -008 8 -003 
12 | 5 -03s+| 4 -015+| 3 -o0s+| 2 -002 18 | 9 -026 8 -o11 7 004 | 7 -004 
11 | 4 -031 3 012 | 2 004 | 2 -004 17} 8 028 | 7 -012 | 6 004 | 6 -004 
10 | 3 026 | 2 -009 2 -009 1 -002 16 | 7-028 | 6 -012 5 004 | 5 -004 
9} 2-019 | 2 019 1 -o0s*+| © -001 15 | 6 026 | § 011 4 004 | 4 -004 
8 | 2 -039 1 012 | 0 002 | O -002 14} 5 -023 5 -023 4 -009 3 -003 
7 | 1 026 0 -o0s+| G -oos+ | — 13 | 5 046 | 4 -019 3 -007 2 -002 
6| 0-012 | O -o12 | — a 12 | 4 038 3-014 | 2 004 | 2 -004 
5| 0-027 | — —_ _ 11 | 3 -029 2 -010-| 2 -010-| 1 -002 
10 | 2-020 | 2 -020 1 -00s+| 0 -001 
18| 20 | 14 041 | 13 017 | 12 -007 | 11 -003 9| 20399 | 1-o1n | O 002 | O om 
19 | 12 -032 | 11 -014 | 10.006 | 9 -o02 8} 1023 | 1-023 | O 004 | O 0% 
18 | 11 -043 | 10 020 | 9 008 | 8 -003 7| 1 045+} 0 009 | O 009 | — 
17 | 10 -oso-| 9 -024 | 7 -004 | 7 -004 6| 0-020 | 0 020 | — in 
16| 8 026 | 7 -o 6 -005-| 6 -005- 5| 0 04 | — — a 
The table shows:
(1) In bold type, for given A, B and a, the value of b (<a) which is just significant at the probability level
quoted (single-tail test).
(2) In small type, for given A, B and r=a+b, the exact probability (if there is independence) that b is equal to
or less than the integer shown in bold type.














Significance tests in a 2 × 2 contingency table (continued)

                       Probability                                   Probability
              a   0.05    0.025    0.01    0.005            a   0.05    0.025    0.01    0.005
A=20 B=9 | 13 | 2 -o4 1 009 | 1-00 | 0 0011 |A=20 B=6 | 14] 1 -032 004 | O 004 | O -0% 
12 | 1 018 1 -018 0 -002 0 -002 13 | O -007 0 -007 0 007 | — 
11 | 1 032 | © -00s-| 0 -005-| O -00s- 12| 0-013 | 0-013 | — 
10 | 0 009 | 0 -009 | Q 009 | — 11 | 0 022 | O 022 | — = 
9| 0 -017 0 017 | — — 10 | O -035s-| — —s a 
8} 0 02 | — et eee 
7 | 0 0so-| — ie ay 5 | 20] 3 -033 2 -004 2 -004 2 -004 
19 | 2 -016 2 -016 1 -002 1 om 
8 | 20) 5 617 5 -017 4 -003 4 -003 18 | 2 -038 1 -005+ | 1 -005+| 0 -00 
19 | 4-015-| 4 -015-| 3 -003 3 -003 17 | 1 -012 1 -012 0 -001 0 -001 
18 | 4 -038 3 -009 3 -009 2 -002 16 | 1 -023 1 -023 0 002 | O om 
17 | 3 -022 3 -022 2 -00s-| 2 -00s- 15 | 1-040 | 0 -005-| 0 -o0s-| O -005- 
16 | 3 -044 2 011 1 -002 1 -002 14} O -009 0 -009 0 009 | — 
15 | 2-022 | 2 -022 1 -004 1 -004 13} 0 o15-| O -o15- | — <a 
14 | 2 040 1 -009 1 -009 | O -001 12} 0-024 | 0 024 | — _ 
13 | 1 -016 1 -016 0 -002 0 -002 11 | 0 038 | — —_ _ 
12} 1 -029 0 -004 0 -004 0 -004 
11 1 -048 0 -008 0 cos | — 4120} 2 022 2 -022 1 -002 1 -002 
10 | O -014 0 014 | — as 19 | 1 -008 1 -008 1 -008 0 -000 
9] O 024 0 02 | — a 18 | 1 -018 1 -018 0 -001 0 001 
8} 004 | — sa ten 17 | 1 -035+| © -003 0 -003 0 -003 
16 | 0 007 | 0 007 | O 007 | — 
7 | 20) 4-012 4 -012 3 -002 3 -002 15 | O -012 0 012 | — — 
19 | 4 -042 3 -009 3 -009 2 -001 14} O -020 0 020 | — — 
18 | 3 -024 3 -024 2 -00s-| 2 -005- 13 | 0-031 | — —_ — 
17 | 3 0so-| 2 -o1 1 -002 1 -002 12| 0 047 | — —_— —- 
16 | 2 -023 2 -023 1 -004 1 -004 
15 | 2 -043 1 -009 1 -009 0 -001 3 | 20] 1 -o12 1 -012 O -001 0 -001 
14 1 -016 1 -016 0 -002 0 -002 19 | 1 -034 0 -002 0 -002 0 -002 
13 | 1 -029 0 -004 0 -004 0 -004 18 | O -006 0 -006 0 006 | — 
12 | 1 -048 0 -007 0 -007 | — 17 | O 011 0 -o1 | — —_ 
11 | O -013 0 013 | — a 16 | O -020 0 020 | — — 
10 | O -022 0 022 | — eld 15 | 0 032 | — _— — 
9| 0-036 | — os a 14| 0 047 | — — — 
6 | 20| 4 -046 3 -008 3 -008 2 -001 2 |20)] 0-004 | 0 -004 | 0 004 | O 004 
19 | 3 -028 2 -00s-| 2 -o0s-| 2 -00s- 19 | 0 -013 0 013 | — 
18 | 2 -013 2 -013 1 -002 1 -002 18 | 0-02 | — — —- 
17 | 2 -028 1 -004 1 -004 1 -004 17 | 0 043 | — — _ 
16 | 1 .-010-| 1 -o10-| 1 -010-| © -001 
15} 1 -o18 1 -018 0 -002 0 -002 1 | 20); O -o48 | — — _ 











































A METHOD FOR JUDGING ALL CONTRASTS IN THE 
ANALYSIS OF VARIANCE* 


By HENRY SCHEFFÉ, Columbia University


A simple answer is found for the following question which has plagued the practice of the analysis of
variance: Under the usual assumptions, if the conventional F-test of the hypothesis $H: \mu_1 = \mu_2 = \dots = \mu_k$
at the $\alpha$ level of significance rejects $H$, what further inferences are valid about the contrasts among the
$\mu_i$ (beyond the inference that the values of the contrasts are not all zero)? Suppose the F-test has $k-1$
and $\nu$ degrees of freedom. For any $c_1, \dots, c_k$ with $\sum_1^k c_i = 0$ write $\theta$ for the contrast $\sum_1^k c_i\mu_i$, and write
$\hat\theta$ and $\hat\sigma^2_{\hat\theta}$ for the usual estimates of $\theta$ and the variance of $\hat\theta$. Then for the totality of contrasts, no
matter what the true values of the $\theta$'s, the probability is $1-\alpha$ that they all satisfy
\[ \hat\theta - S\hat\sigma_{\hat\theta} \le \theta \le \hat\theta + S\hat\sigma_{\hat\theta}, \]
where $S^2$ is $(k-1)$ times the upper $\alpha$ point of the F-distribution with $k-1$ and $\nu$ degrees of freedom.
Suppose we say that the estimated contrast $\hat\theta$ is 'significantly different from zero' if $|\hat\theta| > S\hat\sigma_{\hat\theta}$. Then
the F-test rejects $H$ if and only if some $\hat\theta$ are significantly different from zero, and if it does, we can say
just which $\hat\theta$. More generally, the above inequality can be employed for all the contrasts with the
obvious frequency interpretation about the proportion of experiments in which all statements are correct.
Relations are considered to an earlier method of Tukey using the Studentized range tables and valid
in the special case where the $\hat\mu_i$ all have the same variance and all pairs $\hat\mu_i$, $\hat\mu_j$ ($i \ne j$) have the same
covariance. Some results are obtained for the operating characteristic of the new method. The paper
is organized so that the reader who wishes to learn the method and avoid the proofs may skip §§ 2 and 5.


1. STATEMENT OF THE METHOD 


The general problem is that of making inferences about the contrasts among a set of 'true
means' or 'true main effects' $\mu_1, \mu_2, \dots, \mu_k$ in the analysis of variance. For example, the $\mu_i$
might be the true row effects in a two-way lay-out with possibly unequal numbers of
observations per cell. The $\mu_i$ may be unrestricted or subject to a single restriction of the form
\[ \sum_{i=1}^k h_i\mu_i = h, \tag{1} \]
where the $h_i$ and $h$ are known constants with $\sum_1^k h_i \ne 0$. A contrast is a linear function of
the $\mu_i$,
\[ \theta = \sum_{i=1}^k c_i\mu_i, \tag{2} \]
determined by $k$ known constants $c_i$ satisfying the condition
\[ \sum_{i=1}^k c_i = 0. \tag{3} \]
The value of the linear function for a particular set of $\mu_i$ will be called the value of the contrast;
it will not cause any confusion in the following to use the same symbol $\theta$ both for the contrast
and the value of the contrast.

We make the assumptions usual in the analysis of variance, namely, that there is at hand
a set of statistics $\hat\mu_1, \hat\mu_2, \dots, \hat\mu_k$, and $\hat\sigma^2$, such that the $\hat\mu_i$ have a multivariate normal
distribution and are statistically independent of $\hat\sigma^2$, that
\[ E(\hat\mu_i) = \mu_i \quad (i = 1, \dots, k), \]
and
\[ \operatorname{cov}(\hat\mu_i, \hat\mu_j) = a_{ij}\sigma^2 \quad (i, j = 1, \dots, k), \tag{4} \]
where the constants $a_{ij}$ are known, and $\sigma^2$ is unknown. The $\mu_i$ will always be 'Model I'
(non-random) effects, as discussed by Eisenhart (1947) or Mood (1950). In a pure Model I


* Work sponsored by the Office of Naval Research (U.S.A.). 










situation, $\sigma^2$ is the variance $\sigma_e^2$ of a single observation ('error variance'). In a mixed model
situation where there exists an exact F-test of the hypothesis
\[ H: \mu_1 = \mu_2 = \dots = \mu_k, \tag{5} \]
$\sigma^2$ will equal $\sigma_e^2$ plus further unknown non-negative parameters. In any case, $\hat\sigma^2$ is an
estimate of $\sigma^2$ with $\nu$ D.F. (degrees of freedom), that is, $\nu\hat\sigma^2/\sigma^2$ has the $\chi^2$ distribution with
$\nu$ D.F. The case where $\sigma^2$ is known can be treated by obvious modifications of the theory
below, usually merely by putting $\nu = \infty$ and $\hat\sigma^2 = \sigma^2$ in the results. It is further assumed that
if the $\mu_i$ are unrestricted the rank* of the covariance matrix with elements (4) is $k$, and that
if the $\mu_i$ are subject to a restriction (1) then the $\hat\mu_i$ are subject to the same restriction (1) and
the rank of the covariance matrix is $k-1$.

The hypothesis $H$ in (5), equivalent to the statement that all the contrasts are zero, can
be tested by the conventional F-statistic with $k-1$ and $\nu$ D.F. We shall refer to this test
at significance level $\alpha$ as 'the' F-test of $H$. The problem of making further inferences about
the contrasts, arising when the F-test rejects H, has been considered by various writers, 
including R. A. Fisher (1935), D. Newman (1939), J. W. Tukey (1951), and H. K. Nandi 
(1951). Except for Tukey’s and Nandi’s, the methods involve repeated tests of significance 
on the same data, and are hence subject to the usual objection that little is known about 
the joint operating characteristic. While it is often not possible in practical applications 
to avoid repeated tests of significance, it is possible for the particular problem we are 
considering. 

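The solution studied in this paper is based on the following probability statement about
the infinite totality of contrasts:† For any contrast (2) denote its estimate by $\hat\theta$,
\[ \hat\theta = \sum_{i=1}^k c_i\hat\mu_i, \]
the variance of $\hat\theta$ by $\sigma^2_{\hat\theta}$,
\[ \sigma^2_{\hat\theta} = \sum_{i=1}^k \sum_{j=1}^k a_{ij}c_ic_j\,\sigma^2, \tag{6} \]
and the estimate of this variance by $\hat\sigma^2_{\hat\theta}$,
\[ \hat\sigma^2_{\hat\theta} = \sum_{i=1}^k \sum_{j=1}^k a_{ij}c_ic_j\,\hat\sigma^2. \]
Define the positive constant $S$ from
\[ S^2 = (k-1)\,F_\alpha(k-1, \nu), \tag{7} \]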


* The non-mathematical statistician may safely assume these rank conditions to be satisfied in 
practical applications; they are stated because they are needed later for the mathematical arguments 
in §§2 and 5. 

† The idea of making an overall confidence statement for all the contrasts, its successful realization,
and the resultant possibility of making valid tests of hypotheses suggested by the data, I first met in 
a lecture by Prof. J. W. Tukey on ‘New methods in the analysis of variance: Range in the numerator 
and range in the denominator’ at the Annual Princeton Conference of the American Society for Quality 
Control on 8 December 1951. This method was published (Tukey, 1951). The method of Tukey 
described in §3 was explained in ‘Allowances for various types of error rates’, an unpublished invited 
address presented before a joint meeting of the Institute of Mathematical Statistics and the Eastern 
North American Region of the Biometric Society on 19 March 1952 at Blacksburg, Va.; it differs 
from the published method (Tukey, 1951) only in the use of a root-mean-square instead of a range 
estimate of the error standard deviation. Recently when I communicated the method based on (8) 
to Prof. Tukey he wrote me that it had been familiar to him and Prof. D. B. Duncan for some time, 
and that they had discussed it publicly when he gave a lecture at Blacksburg in November 1951. 







where $F_\alpha(k-1, \nu)$ denotes the upper $\alpha$ point of the F-distribution with $k-1$ and $\nu$ D.F. Then
the probability is $1-\alpha$ that the values $\theta$ of all the contrasts simultaneously satisfy
\[ \hat\theta - S\hat\sigma_{\hat\theta} \le \theta \le \hat\theta + S\hat\sigma_{\hat\theta}, \tag{8} \]
no matter what the values of all unknown parameters. The proof of this statement will be
given in §2.

This result may be used for the interval estimation of all contrasts of interest, including
any suggested by the way the observed means $\hat\mu_i$ fall out. No matter how many contrasts
are estimated by the method (8), the probability that all the statements thus made will be
correct will be $\ge 1-\alpha$.

The result may also be used to declare any estimated contrast 'significantly different
from zero' or not, according as the corresponding interval (8) excludes $\theta = 0$ or not. More
precisely, after selecting a set of coefficients $c_i$ subject to (3) and thus determining a contrast
we make one of the following three statements (it will be convenient also to say we make the
statement for the contrast as well as about its estimate):

(i) $\hat\theta$ is not significantly different from zero,

(ii) $\hat\theta$ is significantly different from zero and positive,

(iii) $\hat\theta$ is significantly different from zero and negative.

We make statement (i) if $-S\hat\sigma_{\hat\theta} \le \hat\theta \le S\hat\sigma_{\hat\theta}$, (ii) if $\hat\theta > S\hat\sigma_{\hat\theta}$, (iii) if $\hat\theta < -S\hat\sigma_{\hat\theta}$. The operating
characteristic of this method is studied in §4.
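
The computation behind (8) and statements (i) to (iii) may be sketched as follows. This is an illustration only: the function names, the three-mean lay-out and all the numbers are hypothetical, and the F point is not computed but assumed to be read from a standard table by the user.

```python
from math import sqrt


def scheffe_interval(c, mu_hat, a, sigma2_hat, f_crit):
    """Return the interval (8) for the contrast with coefficients c.

    c          -- contrast coefficients c_i; they must sum to zero, condition (3)
    mu_hat     -- estimates of the k means
    a          -- k x k matrix of the known constants a_ij in (4)
    sigma2_hat -- estimate of sigma^2
    f_crit     -- upper alpha point F_alpha(k-1, nu), supplied by the caller
    """
    k = len(c)
    if abs(sum(c)) > 1e-9:
        raise ValueError("contrast coefficients must sum to zero")
    theta_hat = sum(ci * mi for ci, mi in zip(c, mu_hat))
    # Estimated variance of theta_hat: sum_i sum_j a_ij c_i c_j sigma2_hat.
    var_hat = sigma2_hat * sum(
        a[i][j] * c[i] * c[j] for i in range(k) for j in range(k)
    )
    s = sqrt((k - 1) * f_crit)  # S from (7)
    half = s * sqrt(var_hat)
    return theta_hat - half, theta_hat + half


def statement(interval):
    """Pick statement (i), (ii) or (iii) according to whether the interval excludes zero."""
    lower, upper = interval
    if lower > 0:
        return "(ii) significantly different from zero and positive"
    if upper < 0:
        return "(iii) significantly different from zero and negative"
    return "(i) not significantly different from zero"


# Hypothetical one-way lay-out: k = 3 means, 5 observations each, nu = 12.
mu_hat = [10.0, 12.0, 17.0]
a = [[0.2, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.2]]
sigma2_hat = 4.0
f_crit = 3.885  # F_0.05(2, 12), assumed copied from a standard F table

pair = scheffe_interval([1.0, -1.0, 0.0], mu_hat, a, sigma2_hat, f_crit)
pooled = scheffe_interval([0.5, 0.5, -1.0], mu_hat, a, sigma2_hat, f_crit)
print(pair, statement(pair))
print(pooled, statement(pooled))
```

Both intervals use the same constant S, so the second contrast could have been chosen after inspecting the means without affecting the overall probability statement.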

We warn the reader here that in the special case where all the $\hat\mu_i$ have the same variance,
and all pairs $\hat\mu_i$, $\hat\mu_j$ ($i \ne j$) have the same covariance, and where the only contrasts of interest
are the $\tfrac12 k(k-1)$ differences $\mu_i - \mu_j$, the method of Tukey described in §3 should be used in
preference to the above, because the confidence intervals will then be shorter. An example
of such a case may be found in Scheffé (1952).


2. PROOF OF THE METHOD


The proof will be made with the aid of a linear transformation and other mathematical 
apparatus which will again be useful in §5 for proving results about the operating charac- 
teristic stated in §4. Although the coefficients of the transformation will be regarded as 
known in the mathematical discussion, they need not be computed for the practical 
applications. 

The dimension of the space of estimated contrasts $\hat\theta$, regarded as linear forms in
indeterminates $\hat\mu_i$, is $k-1$. Under the assumptions we made about the rank of the
covariance matrix (4), we may find a basis for this space such that the corresponding
random variables $\hat\eta_1, \dots, \hat\eta_{k-1}$ will be statistically independent with equal variance
\[ \sigma^2_{\hat\eta} = C^2\sigma^2, \]
where $C$ is some chosen positive constant. The choice of $C$ for the purposes of §4 will be
discussed there; it does not matter at present. Let $\hat\eta_k = k^{-1}\sum_1^k \hat\mu_i$. Then between the $\hat\eta_i$ and
the $\hat\mu_i$ there will exist a non-singular linear relationship,
\[ \hat\mu_i = \sum_{j=1}^k b_{ij}\hat\eta_j \quad (i = 1, \dots, k), \tag{9} \]











where the $b_{ij}$ are constants not depending on unknown parameters. Writing $E(\hat\eta_j) = \eta_j$, we
get from (9)
\[ \mu_i = \sum_{j=1}^k b_{ij}\eta_j \quad (i = 1, \dots, k). \]
Then
\[ \theta = \sum_{i=1}^k c_i\mu_i = \sum_{i=1}^k \sum_{j=1}^k c_i b_{ij}\eta_j. \]
The coefficient of $\eta_k$ must be zero for all $c_1, \dots, c_k$ satisfying (3), so $b_{ik}$ does not depend on
$i$, and we may write
\[ \theta = \sum_{j=1}^{k-1} d_j\eta_j, \]
where
\[ d_j = \sum_{i=1}^k c_i b_{ij} \quad (j = 1, \dots, k-1). \]
Now
\[ \sigma^2_{\hat\theta} = \sum_{j=1}^{k-1} d_j^2\,\sigma^2_{\hat\eta} = C^2\sigma^2 \sum_{j=1}^{k-1} d_j^2. \tag{10} \]
A contrast $\theta$ for which $\sigma^2_{\hat\theta} = C^2\sigma^2$ will be called a normalized contrast and denoted by $\vartheta$. For
a normalized contrast
\[ \vartheta = \sum_{i=1}^k c_i\mu_i = \sum_{j=1}^{k-1} d_j\eta_j \tag{11} \]
we then have
\[ \sum_{j=1}^{k-1} d_j^2 = 1. \]
Clearly it suffices to prove that the probability statement associated with (8) holds for the
totality of normalized contrasts. We shall do this by means of some simple geometric
considerations.

Let us introduce a $(k-1)$-dimensional space, which we shall call the y-space of points
$y = (y_1, \dots, y_{k-1})$, for graphing the parameter point or vector $\eta = (\eta_1, \dots, \eta_{k-1})$, its estimate
$\hat\eta$, and other quantities. The random variables $\sum_1^{k-1} (\hat\eta_i - \eta_i)^2/(C^2\sigma^2)$ and $\nu\hat\sigma^2/\sigma^2$ have
independent $\chi^2$ distributions with $k-1$ and $\nu$ D.F., respectively, and so $\nu/(k-1)$ times their
quotient has the F-distribution. This yields the confidence sphere $\mathcal{S}$,
\[ \sum_{i=1}^{k-1} (y_i - \hat\eta_i)^2 \le S^2C^2\hat\sigma^2, \tag{12} \]
for the parameter point $\eta$, where $S^2$ is defined by (7). The probability is $1-\alpha$ that the
parameter point $\eta$ is covered by $\mathcal{S}$.

Now a normalized contrast is uniquely determined by a coefficient vector $d = (d_1, \dots, d_{k-1})$
of unit length, and it will be convenient in this section and §5 to identify the contrast with
the vector. It is seen from (11) that the value $\vartheta$ of the contrast is the projection of $\eta$ on $d$.
Similarly, the value $\hat\vartheta$ of its estimate is the projection of $\hat\eta$ on $d$. The interval (8) when written
for $\vartheta$ becomes
\[ \hat\vartheta - SC\hat\sigma \le \vartheta \le \hat\vartheta + SC\hat\sigma. \tag{13} \]
If we lay this interval off on the vector $d$, its centre is the projection on $d$ of the centre $\hat\eta$ of
the sphere $\mathcal{S}$, and its half-length is the radius of $\mathcal{S}$; in other words, the interval (13) may be
interpreted as the projection of $\mathcal{S}$ on $d$. The interval covers the true value of the contrast
if and only if the projection of the point $\eta$ on $d$ lies in the projection of $\mathcal{S}$ on $d$. This
happens for all vectors $d$ if and only if $\mathcal{S}$ covers $\eta$, and the probability of this is $1-\alpha$.







3. COMPARISON WITH A METHOD OF TUKEY 


In the special case where all the $\hat\mu_i$ have the same variance $a_{11}\sigma^2$ and all pairs $\hat\mu_i$, $\hat\mu_j$ ($i \ne j$)
have the same covariance $a_{12}\sigma^2$, the following method of Tukey* is applicable. The probability
is $1-\alpha$ that the values $\theta$ of all the contrasts simultaneously satisfy
\[ \hat\theta - T\hat\sigma \le \theta \le \hat\theta + T\hat\sigma, \tag{14} \]
where the constant $T$ is defined as
\[ T = \tfrac12\, q \sum_{i=1}^k |c_i|\,(a_{11} - a_{12})^{1/2}, \tag{15} \]
and $q$ is the upper $\alpha$ point of the Studentized range, for the range of a sample of $k$ in the
numerator, and $\nu$ D.F. in the denominator, that is, the upper $\alpha$ point of the quotient $w/s$,
where $w$ and $s$ ($s > 0$) are statistically independent, $w$ is the range of a random sample of
$k$ standard normal deviates, and $\nu s^2$ has the $\chi^2$ distribution with $\nu$ D.F. This has been tabled
by J. M. May (1952) for $\alpha = 0.05$ and $0.01$, and needs to be tabled for $\alpha = 0.10$.

We propose to compare the efficiency of the two methods by use of the ratio $R$ of the
squared lengths of the confidence intervals (8) and (14),
\[ R = (S^2\hat\sigma^2_{\hat\theta})/(T^2\hat\sigma^2). \]
The motivation for using the squared lengths is that if for a particular contrast the value
of $R$ thus defined is $R_0$, we may say that for large samples the method (8) requires $R_0$ times
as many measurements as (14) to give the same accuracy on this contrast.

After noting that
\[ \sigma^2_{\hat\theta} = (a_{11} - a_{12})\,\sigma^2 \sum_{i=1}^k c_i^2 \]
in the present case, we may express the ratio $R$ as
\[ R = 4\,(S^2/q^2)\,\sum_{i=1}^k c_i^2 \Big/ \Big(\sum_{i=1}^k |c_i|\Big)^2. \tag{16} \]
The contrasts usually of the greatest practical interest are perhaps those consisting of the
difference between the average of $m$ of the $\mu_i$ and the average of $r$ of the other $\mu_i$ ($m + r \le k$);
we shall symbolize this type of contrast by $\{m, r\}$. For instance, a contrast of the type $\{2, 3\}$ is
\[ \tfrac12(\mu_1 + \mu_2) - \tfrac13(\mu_3 + \mu_4 + \mu_5). \tag{17} \]
It may be shown that $R$ attains its maximum value $R_{\max}$ for a contrast of the type $\{1, 1\}$,
that is, a difference of two $\mu_i$, and its minimum value $R_{\min}$ for one of the type $\{\tfrac12 k, \tfrac12 k\}$ if
$k$ is even, $\{\tfrac12(k-1), \tfrac12(k+1)\}$ if $k$ is odd, and hence
\[ R_{\max} = 2(S^2/q^2), \]
\[ R_{\min} = \begin{cases} 4k^{-1}(S^2/q^2) & \text{for } k \text{ even}, \\ 4k(k^2-1)^{-1}(S^2/q^2) & \text{for } k \text{ odd}. \end{cases} \]
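
The ratio R for any particular contrast is easily evaluated once S²/q² is known. The sketch below assumes that (16) has the form R = 4(S²/q²) Σc_i² / (Σ|c_i|)², which is what follows on dividing the squared half-lengths of (8) and (14); the function names are hypothetical, and the value 0.59 is the entry of Table 1 for k = 4, α = 0.05.

```python
def efficiency_ratio(c, s2_over_q2):
    """R for the contrast with coefficients c, assuming the form of (16) stated above."""
    if abs(sum(c)) > 1e-9:
        raise ValueError("contrast coefficients must sum to zero")
    sum_sq = sum(ci * ci for ci in c)
    sum_abs = sum(abs(ci) for ci in c)
    return 4.0 * s2_over_q2 * sum_sq / sum_abs ** 2


def mr_contrast(m, r, k):
    """Coefficients of a contrast of type {m, r} among k means (m + r <= k)."""
    if m + r > k:
        raise ValueError("m + r must not exceed k")
    return [1.0 / m] * m + [-1.0 / r] * r + [0.0] * (k - m - r)


s2_over_q2 = 0.59  # Table 1, k = 4, alpha = 0.05, nu = infinity

r_pair = efficiency_ratio(mr_contrast(1, 1, 4), s2_over_q2)    # type {1, 1}
r_half = efficiency_ratio(mr_contrast(2, 2, 4), s2_over_q2)    # type {2, 2}
r_linear = efficiency_ratio([-3.0, -1.0, 1.0, 3.0], s2_over_q2)  # linear effect
print(r_pair, r_half, r_linear)
```

For k = 4 the types {1, 1} and {2, 2} give the extreme values 2(S²/q²) and 4k⁻¹(S²/q²) respectively.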

Table 1 shows how the relative efficiency of the two methods varies with $k$, the number of
means, in the case $\nu = \infty$. The rows headed $1/R_{\max}$ show the efficiency of method (8) relative
to method (14) on contrasts which are differences of two $\mu_i$; $R_{\min}$ shows the efficiency of
method (14) relative to method (8) on some other contrasts. The value of $S^2/q^2$ is also tabled


* See footnote on p. 88. 










for use with (16) for the calculation of $R$. Table 2 shows that the value of $R$ is not very
sensitive to $\nu$. Of course any increase of $S^2/q^2$ above its value for $\nu = \infty$ listed in Table 1
favours method (14).


Table 1. Variation of relative efficiency R of two methods for different contrasts (ν = ∞)

                                              For k =
  α      Value of      2     3     4     5     6     8     10    13    16    20

  0.10   S²/q²       0.50  0.55  0.60  0.64  0.69  0.78  0.86  0.98  1.09  1.23
         1/R_max     1.00  0.91  0.84  0.78  0.73  0.64  0.58  0.51  0.46  0.40
         R_min       1.00  0.82  0.60  0.54  0.46  0.39  0.34  0.30  0.27  0.25

  0.05   S²/q²       0.50  0.54  0.59  0.64  0.68  0.76  0.85  0.96  1.07  1.20
         1/R_max     1.00  0.92  0.84  0.79  0.73  0.65  0.59  0.52  0.47  0.42
         R_min       1.00  0.82  0.59  0.53  0.45  0.38  0.34  0.30  0.27  0.24

  0.01   S²/q²       0.50  0.54  0.59  0.63  0.67  0.74  0.81  0.92  1.01  1.14
         1/R_max     1.00  0.92  0.85  0.80  0.75  0.67  0.61  0.55  0.49  0.44
         R_min       1.00  0.81  0.59  0.52  0.44  0.37  0.33  0.28  0.25  0.23


















































Table 2. Values of S²/q²

  α      ν        k =   3      6      10     13     16     20

  0.05   5            0.56   0.70   0.89   1.03   1.15   1.31
         7            0.55   0.69   0.87   1.00   1.13   1.28
         10           0.55   0.69   0.87   0.99   1.11   1.26
         20           0.55   0.68   0.86   0.98   1.09   1.25
         40           0.55   0.68   0.85   0.97   1.08   1.23
         ∞            0.54   0.68   0.85   0.96   1.07   1.20

  0.01   5            0.57   0.72   0.91   1.05   1.18   1.34
         7            0.55   0.69   0.87   1.00   1.12   1.27
         10           0.55   0.68   0.85   0.98   1.08   1.24
         20           0.55   0.68   0.84   0.96   1.07   1.21
         40           0.54   0.67   0.83   0.94   1.04   1.18
         ∞            0.54   0.67   0.81   0.92   1.01   1.14
































The difference between the two methods is seen to increase with $k$; as $k$ increases (8) gets
relatively worse on the differences $\mu_i - \mu_j$, while (14) gets relatively worse on some other
contrasts (and gets worse faster). It is clear that the choice between the two methods in the
special case where (14) is applicable depends on which kind of contrast we think will be of
interest.* (It is, of course, not permissible to use both on the same data and then choose the

* For the reader who has not skipped §2, a further comparison of the two methods is contained in
the following loose but perhaps helpful statements: In using method (8) the user 'pays' for more than
he will use. If he knew beforehand just which contrasts were going to interest him, the corresponding
tangent planes to the confidence sphere $\mathcal{S}$ of §2, normal to the coefficient vectors $d$ of the contrasts,
could be constructed, and these would bound a circumscribed polyhedron $\mathcal{P}$. The user pays for the
information that the parameter point is in $\mathcal{S}$ but he uses only the information that it is in $\mathcal{P}$. In










one with the results we like better, unless we are willing to settle for an overall confidence
coefficient known only to be $\ge 1 - 2\alpha$, in which case we may choose for each contrast the
shorter of the two intervals.) If we are interested exclusively in the differences $\mu_i - \mu_j$, we
should choose (14); if we are interested in many types of contrasts, and in investigating
contrasts suggested by the data, (8) would seem superior. Tables 3a and 3b show the
relative efficiencies in the cases $k = 4$ and $k = 6$ on most contrasts likely to be of practical




























Table 3a. Relative efficiency of two methods when k = 4 (α = 0.05, ν = ∞)

  Type of contrast      1/R      R

  {1, 1}                0.84
  {1, 2}                         0.89
  {1, 3}                         0.79
  {2, 2}, quadratic              0.59
  Linear, cubic                  0.74

Table 3b. Relative efficiency when k = 6 (α = 0.05, ν = ∞)

  Type of contrast      1/R      R

  {1, 1}                0.73
  {1, 2}                0.98
  {1, 3}                         0.91
  {1, 4}                         0.85
  {1, 5}                         0.82
  {2, 2}                         0.68
  {2, 3}                         0.57
  {3, 3}                         0.45
  Linear                         0.59
  Quadratic                      0.57
  Cubic                          0.48
  Quartic                        0.53
  Quintic                        0.67

















interest.* The type $\{m, r\}$ is described above (17). The contrasts for linear, quadratic, etc.,
effects are the contrasts used in fitting orthogonal polynomials when the $\mu_i$ correspond to
equal steps of some independent variable. The coefficients $c_i$ for this type of contrast are
listed in Fisher & Yates's tables (1943). The relative efficiency of method (14) is given in the
column headed $R$, of (8) in the column $1/R$. It is seen that for $k = 4$, method (14) is superior
only on the type $\{1, 1\}$, for $k = 6$ also the type $\{1, 2\}$. These tables are for $\nu = \infty$ and require
some correction in favour of (14) for small values of $\nu$; the correction factor is the ratio of
entries in Table 2 for the given $\nu$ and for $\nu = \infty$ at $\alpha = 0.05$.


principle it is possible to calculate a $\mathcal{P}'$ obtained by a uniform contraction of $\mathcal{P}$ about the centre of
$\mathcal{S}$ and to pay just for this; in practice this would be a hopelessly complicated calculation for most
cases. In the special case where the $\hat\mu_i$ have equal variances and equal covariances, and the contrasts
of interest are the $\tfrac12 k(k-1)$ differences $\mu_i - \mu_j$, Tukey's method does calculate the $\mathcal{P}'$. If $\mathcal{P}'$ is available,
we can of course infer about other contrasts by projecting $\mathcal{P}'$ on their vectors $d$, as we projected $\mathcal{S}$.
This is equivalent to (14) for Tukey's method, and we will see below that this does not work as well
for many contrasts of interest. We suggest that it will often be worthwhile to the customer to pay for
more than he will use if he does not know beforehand just what he will need and can buy a good
package.

* Interactions may also be regarded as contrasts. For example, if $\mu_{ij}$ is the true mean for the $(i, j)$
cell in a two-way lay-out, the two-factor interaction in the (1, 2) cell is $\mu_{12} - \mu_{1.} - \mu_{.2} + \mu_{..}$, where the
dots have their usual connotation, and this is a contrast among the $\mu_{ij}$. General superiority of (8) over
(14) for the interaction contrasts is indicated by some numerical calculations.
















4. OPERATING CHARACTERISTIC OF THE METHOD 


For many purposes it may suffice to decide whether an experiment has adequate sensitivity 
by considering the lengths of the confidence intervals (8). In this section we consider the 
effects of the method from a viewpoint close to the Neyman-Pearson (1933) concept of the 
two kinds of error possible in hypothesis-testing. We preface this rather long development 
with a few remarks. 

The power of the method will turn out to seem low to one accustomed to power calculations
for a single t-test.* This has its counterpart in estimation, in that for $k > 2$ our confidence
interval (8) for a contrast will usually be much longer than the $100(1-\alpha)$ % confidence
interval for this contrast alone based on the t-distribution (which it is, of course, not valid


Table 4. Values of t²/S²

  α      ν       k =  2     3     4     5     6     8     10    13    16    20

  0.10   5          1.00  0.54  0.37  0.29  0.24  0.17  0.14  0.10  0.08  0.07
         7          1.00  0.55  0.39  0.30  0.25  0.18  0.15  0.11  0.09  0.07
         10         1.00  0.56  0.40  0.32  0.26  0.19  0.16  0.12  0.10  0.08
         20         1.00  0.57  0.42  0.33  0.28  0.21  0.17  0.13  0.11  0.09
         40         1.00  0.58  0.42  0.34  0.28  0.22  0.18  0.14  0.11  0.09
         ∞          1.00  0.59  0.43  0.35  0.29  0.23  0.18  0.15  0.12  0.10

  0.05   5          1.00  0.57  0.41  0.32  0.26  0.19  0.15  0.12  0.10  0.08
         7          1.00  0.59  0.43  0.34  0.28  0.21  0.17  0.13  0.11  0.09
         10         1.00  0.61  0.45  0.36  0.30  0.23  0.18  0.14  0.12  0.09
         20         1.00  0.62  0.47  0.38  0.32  0.25  0.20  0.16  0.13  0.11
         40         1.00  0.63  0.48  0.39  0.33  0.26  0.21  0.17  0.14  0.12
         ∞          1.00  0.64  0.49  0.40  0.35  0.27  0.23  0.18  0.15  0.13

  0.01   5          1.00  0.61  0.45  0.36  0.30  0.22  0.18  0.14  0.11  0.09
         7          1.00  0.64  0.48  0.39  0.33  0.25  0.20  0.16  0.13  0.10
         10         1.00  0.66  0.51  0.42  0.36  0.28  0.23  0.18  0.15  0.12
         20         1.00  0.69  0.55  0.46  0.39  0.31  0.26  0.21  0.18  0.14
         40         1.00  0.71  0.57  0.48  0.42  0.33  0.28  0.23  0.19  0.16
         ∞          1.00  0.72  0.58  0.50  0.44  0.36  0.31  0.25  0.22  0.18





to apply if the choice of the contrast has been suggested by the configuration of the observed
means). The ratio of the squared lengths of the confidence intervals, namely, $t^2/S^2$, where
$t$ is the upper $\tfrac12\alpha$ point of the t-distribution with $\nu$ D.F. and $S^2$ is given by (7), is listed in
Table 4. While these ratios depart far from unity for $k > 2$, the writer was somewhat
surprised that the departure was not greater. For $k = 6$, for example, it requires only about
three times as many measurements to get confidence intervals for as many contrasts as we
please, including any suggested by the data, with 95 % confidence that all are correct, as
for a 95 % confidence interval for a single one of the contrasts selected before the data are
examined, the confidence interval in both cases having the same expected length.
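
The rows of Table 4 for ν = ∞ can be checked with a short calculation, since in that limit t becomes the normal deviate and S² becomes the upper α point of χ² with k−1 D.F. The following is a sketch under that assumption only; the function names are hypothetical, the χ² distribution function is built up by the usual recurrence in steps of two D.F., and its upper point is found by plain bisection.

```python
from math import erf, exp, gamma, sqrt
from statistics import NormalDist


def chi2_cdf(x, df):
    """P(chi-square variable with df degrees of freedom <= x), df a positive integer."""
    if x <= 0:
        return 0.0
    h = x / 2.0
    # Closed forms for 1 and 2 degrees of freedom, then step df up by 2.
    if df % 2:
        cdf, n = erf(sqrt(h)), 1
    else:
        cdf, n = 1.0 - exp(-h), 2
    while n < df:
        cdf -= h ** (n / 2.0) * exp(-h) / gamma(n / 2.0 + 1.0)
        n += 2
    return cdf


def chi2_upper_point(alpha, df):
    """Upper alpha point of chi-square with df degrees of freedom, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def t2_over_s2_limit(k, alpha):
    """t^2 / S^2 in the limiting case nu = infinity."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 point of t as nu -> infinity
    return z * z / chi2_upper_point(alpha, k - 1)


for k in (2, 3, 4, 5, 6, 8, 10, 13, 16, 20):
    print(k, round(t2_over_s2_limit(k, 0.05), 2))
```

The reciprocal of the value for k = 6, α = 0.05 is the factor of about three quoted above.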

* Those accustomed in applied statistics to making repeated t-tests at the 5% significance level might
consider choosing $\alpha$ = 10% rather than 5% with the new method (8): The user of the repeated 5%
tests is working at some 'overall' significance level that is unknown but greater than 5%; perhaps he
would be glad to settle for a guaranteed 10%. How bad the repeated t method may get to be from the
'overall' point of view is indicated below for confidence intervals in connexion with Tables 5 and 6.







Instead of considering the present method from the point of view of one accustomed to
the method of repeated t-tests, or repeated confidence intervals based on the same data, it
is instructive also to do the reverse. The theory of the method (8) permits the calculation
of a lower bound for the overall confidence coefficient implied by the use of repeated
t confidence intervals for the contrasts, calculated from the same data. Thus, for repeated
95 % confidence intervals based on t we may calculate the bound by substituting in place
of $S$ in (7) the two-tailed 5 % point of t with $\nu$ D.F., and solving the resulting equation for
$1-\alpha$. The values of this bound shown in Table 5 may startle the reader when he considers
that by increasing the number of repeated t confidence intervals, the overall probability

Table 5. Lower bound for probability that all repeated t confidence intervals for
contrasts will be correct (ν = ∞)

Conf.
coeff.   k = 2 |  3   |  4   |  5   |  6   |  8   |  10  |  13   |   16   |    20
0.90      0.90 | 0.74 | 0.56 | 0.39 | 0.25 | 0.09 | 0.03 | 0.003 | 0.0002 | <0.00001
0.95      0.95 | 0.85 | 0.72 | 0.57 | 0.43 | 0.20 | 0.08 | 0.01  | 0.002  | 0.00008
0.99      0.99 | 0.96 | 0.92 | 0.84 | 0.75 | 0.53 | 0.32 | 0.12  | 0.03   | 0.004









































Table 6. Lower bound in certain cases where repeated t confidence intervals are
used only on differences μ_i − μ_j (ν = ∞)

Conf.
coeff.   k = 2 |  3   |  4   |  5   |  6   |  8   |  10  |  13  |  16  |  20
0.90      0.90 | 0.77 | 0.65 | 0.53 | 0.43 | 0.28 | 0.17 | 0.08 | 0.04 | 0.01
0.95      0.95 | 0.88 | 0.80 | 0.71 | 0.63 | 0.49 | 0.37 | 0.24 | 0.15 | 0.08
0.99      0.99 | 0.97 | 0.95 | 0.93 | 0.89 | 0.84 | 0.77 | 0.67 | 0.58 | 0.48





may be brought arbitrarily close to this bound. If repeated $t$ confidence intervals are used
only on the differences $\mu_i - \mu_j$, and if the $\hat\mu_i$ have equal variances and equal covariances, then
the theory of Tukey’s method (14) leads to a better bound, found by solving for $1-\alpha$ the
equation $q = 2^{1/2}\,t_{0.05}$, where $q$ is the same function of $\alpha$, $k$, $\nu$ as in (15). This bound is attained
if all $\tfrac{1}{2}k(k-1)$ statements are made about the differences $\mu_i - \mu_j$. Values of the bound are
shown in Table 6. We may try to take comfort from the thought that if we always made all
$\tfrac{1}{2}k(k-1)$ statements based on 5% $t$, we should in the long run of experiments have 95%
of all the statements we made correct, but we should remember that of the statements to
which we are likely to pay the most attention more than 5% tend to be wrong; for example,
statements associated with the larger observed differences.
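
The $\nu = \infty$ entries of Table 5 can be checked numerically. The following is a minimal sketch, assuming that for $\nu = \infty$ the quantity $S^2$ of (7) is the upper $\alpha$ point of $\chi^2$ with $k-1$ D.F., so that substituting the two-tailed normal point $z$ for $S$ gives the bound $\Pr\{\chi^2_{k-1} \le z^2\}$. The function names are illustrative only and do not come from the paper.

```python
import math
from statistics import NormalDist


def chi2_sf(x, df):
    """Upper-tail probability Pr{chi-square with integer df > x}."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values for 1 and 2 degrees of freedom.
    if df % 2:
        q, m = math.erfc(math.sqrt(half)), 1
    else:
        q, m = math.exp(-half), 2
    # Recurrence Q(x|m+2) = Q(x|m) + (x/2)^(m/2) e^(-x/2) / Gamma(m/2 + 1).
    while m < df:
        q += math.exp((m / 2.0) * math.log(half) - half - math.lgamma(m / 2.0 + 1.0))
        m += 2
    return q


def repeated_t_lower_bound(conf, k):
    """Lower bound on the overall confidence of repeated intervals (nu = infinity)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)  # two-tailed normal point
    return 1.0 - chi2_sf(z * z, k - 1)


for conf in (0.90, 0.95, 0.99):
    print(conf, [round(repeated_t_lower_bound(conf, k), 2) for k in (2, 3, 4, 5, 6)])
```

For $k = 2$ the bound reduces to the nominal confidence coefficient, as in the first column of the table.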

The sensitivity of the new method is exactly the same as that of the $F$-test of the
hypothesis $H$ in (5) at significance level $\alpha$ in the following sense: If the $F$-test accepts the
hypothesis $H$ that all the contrasts are zero, the new method will make statement (i) for all
contrasts, that is, say no contrast is significantly different from zero; if the $F$-test rejects








96 A method for judging all contrasts in the analysis of variance 


the hypothesis, the new method will make statements (ii) and (iii) for some contrasts, that 
is, say they are significantly different from zero (and positive or negative, respectively). 
This may readily be seen* from the geometrical picture introduced in § 2. 

To formulate the problem of the power of the method, we consider the probabilities of
making each of the statements (i), (ii) and (iii). We might think at first that what we would
like is to make statement (i) for a contrast whose true value is zero, (ii) for one whose true
value is greater than some assigned $\theta_0$, (iii) for one whose true value is less than $-\theta_0$, and
that what we want are the probabilities of the desired statements in each case. This
formulation is at once seen to be nonsensical, because for any contrast $\theta_1$ whose true value is positive
there exist constants $h'$ and $h''$ such that the true value of $h'\theta_1$ is $< \theta_0$, the true value of
$h''\theta_1$ is $> \theta_0$, and yet the same one of the statements (i), (ii) and (iii) will be made for all
contrasts $h\theta_1$ with positive $h$. This difficulty can be avoided by laying our requirements on
the suitably normalized contrasts; if we find what happens for the normalized contrasts,
we will know what happens for all contrasts. The appropriate definition is that a contrast
is normalized if the variance of its estimate is $C^2\sigma^2$, where $C$ is a specified constant; for
normalized contrasts and their estimates we then write $\vartheta$ and $\hat\vartheta$ instead of $\theta$ and $\hat\theta$.

To see how the above difficulty is met, suppose the structure of an experiment is such that
the contrast $\mu_1 - \mu_2$ is determined with greater precision than the contrast $\mu_3 - \mu_4$. This is
a property of the design which presumably was desired, and if we guarantee a certain
probability $P$ of detecting (that is, making statement (ii) for) a difference $\mu_1 - \mu_2$ as great
as $\theta_0$, we should be satisfied with the same probability $P$ of detecting a difference $\mu_3 - \mu_4$ as
great as a certain multiple of $\theta_0$. The multiple assigned by the present method will later be
seen to be the ratio of the standard errors of estimate of the two contrasts. Since this
property may be expressed by saying that ‘the method assures the same probability $P$ of
detecting all normalized contrasts as great as a certain bound $\vartheta_0$’, it indicates that we have
a way of normalizing the contrasts which is suitable for this method.

While this motivates our normalizing the contrasts so that their estimates all have the
same variance $C^2\sigma^2$, there is still the question of the choice of the constant $C$. Whether there
is a satisfactory universal choice of $C$ is dubious.† In any event, the question is of little
practical importance, and we need not settle it in order to analyse the operating
characteristic. If for the same experiment two statisticians choose $C$ and $C'$, then equivalent
choices of the bounds $\vartheta_0$ and $\vartheta_0'$ are related by $\vartheta_0' = (C'/C)\,\vartheta_0$.

A simple example illustrating these considerations will be helpful. Suppose we wish to
design an experiment in a ‘one-way lay-out’ with $k = 6$, that we wish to take twice as many
observations on the means $\mu_1, \mu_2, \mu_3$ as on $\mu_4, \mu_5, \mu_6$, and that we are primarily interested in
the contrasts which are differences $\mu_i - \mu_j$ (for equal numbers of observations per mean we


* In the notation of §2, $H$ may be written $\eta = 0$. Thus the $F$-test accepts $H$ if and only if the
confidence sphere covers the origin, and this is precisely the case where statement (i) will be made about
all the contrasts.

+ A possible universal choice of C might be the following : transform from Yes eS See é, 
by an orthogonal transformation, such that £, is the fi, of §2, and ie heat oi 1 are statistically inde- 
pendent. The set E., « rie E,_ , then spans the space of the estimated contrasts. Let o{= = var (£;). Define 
C so that Co is the geometric mean of 6}, ..., 7%. C is then a function of the a,, in (6) alone. It may 
be shown that in the case of independent $\hat\mu_i$ with equal variances $\sigma^2/n$ this gives $C = n^{-1/2}$; more generally
if the $\hat\mu_i$ are independent with respective variances $\sigma^2/n_i$, this gives $C^{2(k-1)} = \bar n/\prod n_i$, where $\bar n = \sum n_i/k$.
In the example below, this leads to $C^2 = (\tfrac{3}{16})^{1/5}/n$, which does not particularly recommend itself from
the practical computational point of view.







should then use Tukey’s method). These contrasts will then be determined with three kinds
of precision, namely, the precision of those like $\mu_1 - \mu_2$, like $\mu_4 - \mu_5$, and like $\mu_1 - \mu_4$, the three
standard errors of estimate being in the ratio $2^{1/2} : 4^{1/2} : 3^{1/2}$. Suppose we are satisfied with a
probability of $P = 0.90$ of detecting a difference $\mu_1 - \mu_2$ as great as $1.0\sigma$. We will then get
the same probability 0.90 for a difference $\mu_4 - \mu_5$ as great as $1.0\sigma\,(4^{1/2}/2^{1/2}) = 1.4\sigma$, or a difference
$\mu_1 - \mu_4$ as great as $1.0\sigma\,(3^{1/2}/2^{1/2}) = 1.2\sigma$. If for either of the last two differences this sensitivity
is not considered sufficient then we should question whether we have chosen the correct
design.

Let $n$ be the number of observations on $\mu_4, \mu_5, \mu_6$, and $2n$ the number on $\mu_1, \mu_2, \mu_3$. While
the choice of the normalizing constant $C$ does not matter much and might even be left
indefinite, we shall suppose for concreteness in this example that the contrasts are normalized
so that their estimates all have the same variance as that of $\hat\mu_1 - \hat\mu_2$, namely, $\sigma^2/n$. Since
for any contrast

$$\operatorname{var}\Bigl(\sum_{1}^{6} c_i\hat\mu_i\Bigr) = \Bigl(\tfrac{1}{2}\sum_{1}^{3} c_i^2 + \sum_{4}^{6} c_i^2\Bigr)\sigma^2/n,$$

the coefficients in a normalized contrast $\vartheta = \sum_{1}^{k} c_i\mu_i$ must then satisfy the condition

$$\tfrac{1}{2}\sum_{1}^{3} c_i^2 + \sum_{4}^{6} c_i^2 = 1.$$

Thus the normalized form of the contrast $\mu_4 - \mu_5$ is $2^{-1/2}(\mu_4 - \mu_5)$, and of $\mu_1 - \mu_4$, $(\tfrac{2}{3})^{1/2}(\mu_1 - \mu_4)$.
Below we shall find how to choose $n$ so that the probability is 0.90 of detecting a difference
of $\mu_1 - \mu_2$ equal to a given multiple of $\sigma$, say $1.0\sigma = \vartheta_0$. The probability will then be the same
for the value $\vartheta_0$ of the contrast $2^{-1/2}(\mu_4 - \mu_5)$ or the value $\vartheta_0$ of the contrast $(\tfrac{2}{3})^{1/2}(\mu_1 - \mu_4)$.
This gives the statements in the last paragraph about the differences $\mu_4 - \mu_5$ and $\mu_1 - \mu_4$.
In such an experiment the contrast consisting of the difference of the average of $\mu_1, \mu_2, \mu_3$
and the average of $\mu_4, \mu_5, \mu_6$ might also be of interest. Its normalized form is

$$(\tfrac{2}{9})^{1/2}\Bigl(\sum_{1}^{3}\mu_i - \sum_{4}^{6}\mu_i\Bigr),$$

and we find in a similar way that a difference of the averages $\sum_{1}^{3}\mu_i/3$ and $\sum_{4}^{6}\mu_i/3$ equal to
$(\tfrac{1}{3})(\tfrac{9}{2})^{1/2}\vartheta_0 = 0.7\sigma$ would also be detected with probability 0.90.
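
The arithmetic of this normalization is easy to reproduce. The sketch below assumes only the layout stated in the example ($2n$ observations on each of the first three means and $n$ on each of the last three); the names `REPLICATION`, `relative_variance`, `normalized` and `detectable_multiple` are illustrative and not from the paper.

```python
import math

# Relative numbers of observations: 2n on mu_1..mu_3, n on mu_4..mu_6.
REPLICATION = [2, 2, 2, 1, 1, 1]


def relative_variance(c):
    """Variance of sum(c_i * mu_hat_i) in units of sigma^2 / n."""
    return sum(ci * ci / r for ci, r in zip(c, REPLICATION))


def normalized(c):
    """Rescale c so that the estimate has variance sigma^2 / n."""
    scale = 1.0 / math.sqrt(relative_variance(c))
    return [scale * ci for ci in c]


def detectable_multiple(c):
    """Value of the raw contrast corresponding to a normalized value of 1."""
    return math.sqrt(relative_variance(c))


contrasts = {
    "mu1 - mu2": [1, -1, 0, 0, 0, 0],
    "mu4 - mu5": [0, 0, 0, 1, -1, 0],
    "mu1 - mu4": [1, 0, 0, -1, 0, 0],
    "avg(1..3) - avg(4..6)": [1 / 3, 1 / 3, 1 / 3, -1 / 3, -1 / 3, -1 / 3],
}
for name, c in contrasts.items():
    print(name, round(detectable_multiple(c), 2))
```

Multiplying each printed value by $\vartheta_0 = 1.0\sigma$ gives the detectable differences quoted in the text (to the two figures printed there).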

In the following calculations we shall symbolize by $t'(f; \delta)$ a non-central $t$ variable with
$f$ D.F. and non-centrality parameter $\delta$, that is, a variable distributed as the quotient of $z + \delta$
by $f^{-1/2}\chi$, where $z$ is a standard normal deviate and $\chi$ is an independent $\chi$ variable with $f$ D.F.

Statements (i), (ii) or (iii) are made for a contrast $\theta$ according as $\hat\theta/\hat\sigma_{\hat\theta}$ falls in the intervals
$(-S, S)$, $(S, \infty)$, or $(-\infty, -S)$. But the variable $\hat\theta/\hat\sigma_{\hat\theta}$ has the non-central $t$-distribution with
$\nu$ D.F. and parameter $\delta = \theta/\sigma_{\hat\theta}$, where $\sigma_{\hat\theta}$ is given by (6). It follows that the probability of
each of these statements being made depends only on the true value of $\theta/\sigma_{\hat\theta}$ for the particular
contrast for which the statement is made, and on the constant $S$.

If the true value of $\theta$ is zero, the probability that we make the desired statement (i) is

$$\Pr\{-S < t'(\nu; 0) < S\},$$

where $t'(\nu; 0)$ denotes a central $t$ variable with $\nu$ D.F. This may also be written

$$\Pr\{F(1, \nu) < (k-1)\,F_\alpha(k-1, \nu)\},$$

where $F(1, \nu)$ is an $F$ variable with 1 and $\nu$ D.F. That this is at least $1 - \alpha$ is evident from the
confidence statement associated with (8). Indeed, as $k$ increases from 2, this probability
increases rapidly from $1 - \alpha$ as indicated in Table 7, where 1 minus this probability is tabled
for the case $\nu = \infty$.


Table 7. Probability of not making statement (i) about a contrast whose
true value is zero (ν = ∞)

  α    k = 2 |   3    |    4    |    5    |    6    |    8     |    10     |     13     |     16      |      20
0.10   0.10  | 0.032  | 0.012   | 0.0053  | 0.0024  | 0.00053  | 0.00013   | 0.000017   | 0.0000023   | 0.00000018
0.05   0.050 | 0.014  | 0.0052  | 0.0021  | 0.00088 | 0.00018  | 0.000039  | 0.0000045  | 0.00000057  | 0.000000040
0.01   0.010 | 0.0024 | 0.00076 | 0.00027 | 0.00010 | 0.000017 | 0.0000032 | 0.00000031 | 0.000000032 | 0.0000000018
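
The entries of Table 7 can be reproduced approximately with the standard library alone. This is a sketch under the assumption that, for $\nu = \infty$, $t'(\nu; 0)$ is a standard normal deviate and $S^2$ is the upper $\alpha$ point of $\chi^2$ with $k-1$ D.F.; the helper names and the bisection routine are illustrative choices, not the computing method used for the paper.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability Pr{chi-square with integer df > x}."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:
        q, m = math.erfc(math.sqrt(half)), 1
    else:
        q, m = math.exp(-half), 2
    # Recurrence in steps of two degrees of freedom.
    while m < df:
        q += math.exp((m / 2.0) * math.log(half) - half - math.lgamma(m / 2.0 + 1.0))
        m += 2
    return q


def chi2_upper_point(alpha, df):
    """Solve chi2_sf(x, df) = alpha for x by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def marginal_error(alpha, k):
    """Pr{|z| > S} for a standard normal z, with S^2 the chi-square point."""
    s2 = chi2_upper_point(alpha, k - 1)
    return math.erfc(math.sqrt(s2 / 2.0))


for alpha in (0.10, 0.05, 0.01):
    print(alpha, ["%.2g" % marginal_error(alpha, k) for k in (2, 3, 4, 5, 6, 8, 10)])
```

For $k = 3$ the chi-square point has the closed form $-2\ln\alpha$, which gives a quick independent check of the bisection.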









































For any contrast $\theta$ the probability that we make statement (ii) is

$$\Pr\{t'(\nu;\ \theta/\sigma_{\hat\theta}) > S\}. \tag{18}$$

This is a strictly increasing function of the value of $\theta/\sigma_{\hat\theta}$. Suppose now the contrast is
normalized; we then write it as $\vartheta$ and have $\sigma_{\hat\vartheta} = C\sigma$. Writing $P$ for the probability (18) when
$\vartheta = \vartheta_0$, we have

$$P = \Pr\{t'(\nu;\ \vartheta_0/(C\sigma)) > S\}. \tag{19}$$

From the tables of non-central $t$ by Johnson & Welch (1940) we can find the value of
$\delta = \vartheta_0/(C\sigma)$ for which $P$ attains the values 0.01, 0.05, 0.1 (0.1) 0.9, 0.95, 0.99.

The probability of making statement (iii) for any contrast can of course be calculated as 
the probability of making statement (ii) for its negative. 

In the example introduced above of the experiment with 6 means, $C = n^{-1/2}$. If we
take $n = 10$ and use significance level $\alpha = 0.05$, then $\nu = 84$, $S^2 = 5F_{0.05}(5, 84) = (3.41)^2$,
and we find from (19) and the Johnson & Welch tables that for $P = 0.90$, $\delta = 4.72$, so
$\vartheta_0 = 4.72\,C\sigma = 1.49\sigma$. This poor sensitivity is related to the extremely low risk (0.0009 in
Table 7 for $\nu = \infty$) of making the other kind of error, namely, calling significantly different
from zero the estimate of a contrast whose true value is zero. We supposed in this example
that we desired the $n$ that would give a $P = 0.90$ for detecting a $\vartheta = 1.0\sigma = \vartheta_0$. Since for
large $\nu$ the parameter $\delta$ found from the tables does not vary much with $\nu$ we try for a first
approximation for $n$ the solution of $\vartheta_0/(C\sigma) = (1.0)\,n^{1/2} = \delta$ with the former $\delta = 4.72$. This
gives $n = 22$. Calculation of $\vartheta_0$ from the tables for $n = 21, 22$, as above for $n = 10$, shows
that $n = 22$ is the required value. In the same way we find that if we desire $\vartheta_0 = 2.0\sigma$ for
$P = 0.90$, a first approximation to the required $n$ is $n = (4.72/2.0)^2 = 6$; use of the tables
also gives $n = 6$. In all these calculations we took $\alpha = 0.05$.
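
The two pieces of arithmetic in this example can be sketched as follows. The value $\delta = 4.72$ is taken from the text (it comes from the Johnson & Welch tables and is not computed here), $C = n^{-1/2}$ is specific to this example, and the function names are illustrative.

```python
import math


def detectable_bound(delta, n):
    """theta_0 / sigma = delta * C, with C = n ** -0.5 as in this example."""
    return delta / math.sqrt(n)


def first_approx_n(delta, target):
    """Solve target * sqrt(n) = delta for n, rounded to the nearest integer."""
    return round((delta / target) ** 2)


print(round(detectable_bound(4.72, 10), 2))  # bound for n = 10, in units of sigma
print(first_approx_n(4.72, 1.0))             # first approximation for 1.0 sigma
print(first_approx_n(4.72, 2.0))             # first approximation for 2.0 sigma
```

The first approximation only fixes a starting point; as the text notes, the final $n$ is confirmed by re-entering the tables with the corresponding $\nu$.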

Although the behaviour of the operating characteristic which we have now determined for
any single contrast considered alone is of interest, it seems to the writer that in this method
designed for judging all the contrasts it is of greater interest to determine the following two
properties of the operating characteristic: First, the probability $P_1$ of the event $\mathcal{E}_1$ that the
desired statement (i) would be made for all the contrasts whose true values are zero, and secondly,
the probability $P_2$ of the event $\mathcal{E}_2$ that the desired statement (ii) would be made for all normalized
contrasts whose values are $\ge \vartheta_0$ (in which case the desired statement (iii) would also be made for
all normalized contrasts whose values are $\le -\vartheta_0$).










We shall state here the results for the probabilities $P_1$ and $P_2$ and defer the proofs to the
next section. It is obvious from (8) that if the hypothesis $H$ in (5) is true, $P_1 = 1 - \alpha$. If
$H$ is false

$$P_1 = \Pr\{F(k-2, \nu) < (k-1)(k-2)^{-1}F_\alpha(k-1, \nu)\}, \tag{20}$$

where $F(m, \nu)$ denotes an $F$ variable with $m$ and $\nu$ D.F., and $F_\alpha(m, \nu)$, its upper $\alpha$ point. It is
interesting to note that $P_1$ depends only on whether $H$ is true or false and not further on any
unknown parameters. It is easy to see that if $H$ is false $P_1 > 1 - \alpha$. Table 8 gives some values
of $1 - P_1$; we remark that for $k = 2$ there are no zero contrasts other than $0\cdot\mu_1 + 0\cdot\mu_2$ if $H$ is
false, and so $1 - P_1 = 0$ if $H$ is false, trivially. We see now that this analogue of the
Neyman-Pearson risk of a ‘type I’ error, namely, the probability $1 - P_1$ of failing to make
statement (i) for all zero contrasts, is the one the method controls in a satisfactory way,
rather than the marginal probability for a single zero contrast considered above, which
assumes the microscopic values indicated by Table 7.


Table 8. Values of 1 − P₁ if H is false

  α     ν   k = 3  |   4    |   5    |   6    |   8    |   10   |   13   |   16   |   20
0.10    5   0.040  | 0.056  | 0.065  | 0.070  | 0.077  | 0.082  | 0.086  | 0.088  | 0.091
        7   0.038  | 0.053  | 0.061  | 0.067  | 0.074  | 0.079  | 0.083  | 0.086  | 0.089
       10   0.036  | 0.050  | 0.058  | 0.064  | 0.071  | 0.076  | 0.081  | 0.084  | 0.087
       20   0.034  | 0.047  | 0.055  | 0.060  | 0.067  | 0.072  | 0.077  | 0.080  | 0.083
       40   0.033  | 0.046  | 0.053  | 0.058  | 0.065  | 0.069  | 0.074  | 0.077  | 0.080
        ∞   0.032  | 0.044  | 0.051  | 0.055  | 0.062  | 0.066  | 0.070  | 0.073  | 0.075

0.05    5   0.019  | 0.027  | 0.031  | 0.034  | 0.038  | 0.040  | 0.042  | 0.044  | 0.045
        7   0.018  | 0.025  | 0.029  | 0.032  | 0.036  | 0.039  | 0.041  | 0.042  | 0.044
       10   0.017  | 0.024  | 0.028  | 0.031  | 0.035  | 0.037  | 0.040  | 0.041  | 0.043
       20   0.016  | 0.022  | 0.026  | 0.028  | 0.032  | 0.035  | 0.037  | 0.039  | 0.041
       40   0.015  | 0.021  | 0.025  | 0.027  | 0.031  | 0.033  | 0.035  | 0.037  | 0.039
        ∞   0.014  | 0.020  | 0.023  | 0.026  | 0.029  | 0.031  | 0.033  | 0.035  | 0.036

0.01    5   0.0036 | 0.0051 | 0.0060 | 0.0067 | 0.0074 | 0.0079 | 0.0084 | 0.0087 | 0.0089
        7   0.0033 | 0.0047 | 0.0056 | 0.0062 | 0.0070 | 0.0075 | 0.0080 | 0.0083 | 0.0086
       10   0.0030 | 0.0044 | 0.0052 | 0.0058 | 0.0066 | 0.0071 | 0.0076 | 0.0080 | 0.0083
       20   0.0027 | 0.0039 | 0.0047 | 0.0052 | 0.0060 | 0.0065 | 0.0070 | 0.0074 | 0.0078
       40   0.0026 | 0.0037 | 0.0044 | 0.0049 | 0.0056 | 0.0061 | 0.0066 | 0.0076 | 0.0074
        ∞   0.0024 | 0.0034 | 0.0041 | 0.0045 | 0.0052 | 0.0056 | 0.0061 | 0.0063 | 0.0067
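
The $\nu = \infty$ rows of Table 8 can be checked in the same spirit. The sketch assumes that for $\nu = \infty$ the relation (20) becomes $P_1 = \Pr\{\chi^2_{k-2} < \chi^2_{\alpha;k-1}\}$, since $m\,F(m, \infty)$ is then a $\chi^2$ variable with $m$ D.F.; the function names are illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability Pr{chi-square with integer df > x}."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:
        q, m = math.erfc(math.sqrt(half)), 1
    else:
        q, m = math.exp(-half), 2
    while m < df:
        q += math.exp((m / 2.0) * math.log(half) - half - math.lgamma(m / 2.0 + 1.0))
        m += 2
    return q


def chi2_upper_point(alpha, df):
    """Upper alpha point of chi-square, found by bisection on chi2_sf."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if chi2_sf(mid, df) > alpha else (lo, mid)
    return 0.5 * (lo + hi)


def one_minus_p1(alpha, k):
    """1 - P_1 when H is false and nu = infinity (requires k >= 3)."""
    return chi2_sf(chi2_upper_point(alpha, k - 1), k - 2)


for alpha in (0.10, 0.05, 0.01):
    print(alpha, [round(one_minus_p1(alpha, k), 4) for k in (3, 4, 5, 6)])
```

For $k = 3$ the result coincides with the $k = 3$ column of Table 7, as it should, since both reduce to $\Pr\{\chi^2_1 > \chi^2_{\alpha;2}\}$.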









































Let $\vartheta_{\max}$ denote the largest of the true values of all the normalized contrasts. If $\vartheta_0 > \vartheta_{\max}$
the probability $P_2$ is trivially 1. If $\vartheta_0 \le \vartheta_{\max}$, define the angle $\psi$ from

$$\sin\psi = \vartheta_0/\vartheta_{\max}. \tag{21}$$

Then it will be shown that in this case $P_2$ is the probability that

$$(\cot\psi)\,\chi_1 + (S\nu^{-1/2}\operatorname{cosec}\psi)\,\chi_2 < z_1 + (C\sigma)^{-1}\vartheta_0\operatorname{cosec}\psi, \tag{22}$$

where $\chi_1, \chi_2, z_1$ are statistically independent, $\chi_1$ and $\chi_2$ are $\chi$ variables with $k-2$ and $\nu$ D.F.,
respectively, and $z_1$ is a standard normal deviate. The probability $P_2$ does not seem exactly
calculable from any existing tables, but an excellent approximation* may be obtained for

* Johnson & Welch (1940) used a similar method to approximate non-central $t$.










moderate or large values of $\nu$ by replacing $\nu^{-1/2}\chi_2$ by a normal variable $z_2$ with mean 1 and
variance $1/(2\nu)$. A similar approximation would not work so well for $\chi_1$, since its number
$k-2$ of D.F. might be quite small. This way we find $P_2$ approximately equal to the probability
that

$$z + \delta > t_0\,(k-2)^{-1/2}\chi_1,$$

where

$$z = [1 + (2\nu)^{-1}S^2\operatorname{cosec}^2\psi]^{-1/2}\,(z_1 - z_2 S\operatorname{cosec}\psi + S\operatorname{cosec}\psi)$$

is a standard normal deviate and independent of $\chi_1$;

$$\delta = [1 + (2\nu)^{-1}S^2\operatorname{cosec}^2\psi]^{-1/2}\,[(C\sigma)^{-1}\vartheta_0\operatorname{cosec}\psi - S\operatorname{cosec}\psi], \tag{23}$$

$$t_0 = [1 + (2\nu)^{-1}S^2\operatorname{cosec}^2\psi]^{-1/2}\,(k-2)^{1/2}\cot\psi. \tag{24}$$

To this approximation $P_2$ can thus be expressed in terms of the non-central $t$-distribution as

$$\Pr\{t'(k-2;\ \delta) > t_0\}, \tag{25}$$

where $\delta$ and $t_0$ are given by (23) and (24). In the case where $\sigma$ is known we can put $\nu = \infty$
in (22), so that the left member becomes $(\cot\psi)\,\chi_1 + S\operatorname{cosec}\psi$; we may also put $\nu = \infty$ in
(23) and (24), and the above approximation for $P_2$ then becomes exact.

The exact probability $P_2$ and its approximation (25) depend on the parameter $\vartheta_{\max}$, which
enters (22) through $\psi$. As indicated in the example of the six means, we may be willing to
specify $\vartheta_0$ beforehand as a multiple of $\sigma$, say

$$\vartheta_0 = \lambda\sigma, \tag{26}$$

where $\lambda$ is a given constant, but it is unpleasant to discover then that $P_2$ still depends on
the unknown parameter

$$\gamma = \vartheta_{\max}/\sigma. \tag{27}$$

The dependence of the exact probability $P_2$ of (22) on $\gamma$ may be made clear by writing (22) as

$$(\gamma^2\lambda^{-2} - 1)^{1/2}\chi_1 + (\nu^{-1/2}S\gamma/\lambda)\,\chi_2 < z_1 + \gamma C^{-1}, \tag{28}$$

where the variables $\chi_1, \chi_2, z_1$ have the distribution stated below (22). For $\gamma \ge \lambda$, $P_2$ is the
probability of (28); for $0 \le \gamma < \lambda$, $P_2$ trivially equals 1. This problem did not arise when we
obtained in (19) the corresponding marginal probability for a single contrast alone whose
value $= \vartheta_0$ (the non-centrality parameter $\delta$ there may be written $\lambda/C$), as it does now when
we consider the overall probability for all contrasts whose value is $\ge \vartheta_0$ (we could write $= \vartheta_0$
here also).

We suggest meeting this problem by using a lower bound $P_2'$ for $P_2$. We shall prove that
$P_2 = P_2(\gamma)$ is monotone in $\gamma$, decreasing from the value 1 for $\gamma < \lambda$ to a limiting value
$P_2(\infty)$ as $\gamma$ increases from 0 to $\infty$. This means that if we are willing to assume $\gamma$ does not
exceed a known $\gamma_1$, we may use as the bound $P_2'$ the value $P_2(\gamma_1)$, which may be accurately
approximated by (23), (24), (25) with $\psi = \arcsin(\lambda/\gamma_1)$, and if we are unwilling to set
a bound $\gamma_1$ for $\gamma$, or if we are satisfied with a somewhat larger but much more easily
calculated bound, we may use $P_2' = P_2(\infty)$, whose value will now be stated.

The limiting value $P_2(\infty)$ is the probability that

$$\chi_1 + \nu^{-1/2}S\chi_2 < \lambda/C, \tag{29}$$

where $\chi_1$ and $\chi_2$ are distributed as in (22). This may be accurately approximated by the same
method that led to (25), with the result that $P_2(\infty)$ is approximately*

$$\Pr\{t'(k-2;\ \delta) > t_0\}, \tag{30}$$

* It is unfortunate in connexion with (30) and (25) that the Johnson & Welch tables (1940) do not
go below 4 D.F., since 1, 2, 3 D.F. are needed for $k = 3, 4, 5$; it would be very desirable to have this
extension of their tables.








where

$$\delta = (2\nu)^{1/2}\,[\lambda(CS)^{-1} - 1], \tag{31}$$

$$t_0 = (2\nu)^{1/2}(k-2)^{1/2}/S. \tag{32}$$


For very large values of $\nu$ we may use the following simple approximation to $P_2(\infty)$, obtained
by replacing $\nu^{-1/2}\chi_2$ by 1 in (29):

$$\Pr\{\chi_1 < \lambda C^{-1} - S\}.$$

To this approximation we may thus rapidly find for given $\beta$ the $\lambda$ such that $P_2(\infty)$ attains
the value $1 - \beta$ for $\vartheta_0 = \lambda\sigma$ by taking the square root of the upper $\beta$ point from the $\chi^2$ tables
for $k-2$ D.F., say $\chi_{\beta;k-2}$, and computing

$$\lambda = C(S + \chi_{\beta;k-2}). \tag{33}$$

A still rougher approximation might be obtained by replacing $S$ by its value for $\nu = \infty$,
namely, $\chi_{\alpha;k-1}$, to give

$$\lambda = C(\chi_{\alpha;k-1} + \chi_{\beta;k-2}). \tag{34}$$

The approximation (34) becomes exact when $\sigma$ is known; this is obtained by replacing $\hat\sigma$ by
$\sigma$ and $\nu$ by $\infty$ in the calculations.

To illustrate these results in the example we have carried along, we recall $k = 6$, $C = n^{-1/2}$,
$\alpha = 0.05$. Suppose we wish a bound $P_2'$ for $P_2$ and decide to use $P_2' = P_2(\infty)$. If we take
$n = 10$, then $\nu = 84$, and entering the Johnson & Welch tables (1940) with $t_0 = 7.61$ from
(32) we find that a $P_2(\infty) = 0.90$ (to the approximation of (30)) is attained for $\delta = 10.8$, and
this gives $\lambda = 1.98$ from (31), that is, for $\vartheta_0 = 1.98\sigma$. The approximation (33) much more
quickly gives $\lambda = 6.20C = 6.20n^{-1/2}$ or $\lambda = 1.96$. The approximation (34) gives $\lambda = 1.93$.
To find the $n$ which gives a $P_2(\infty) = 0.90$ for $\vartheta_0 = 1.0\sigma$, we first use the previous
approximation from (33), $\lambda = 6.20n^{-1/2}$, which is calculated with an $S$ based on the now wrong $\nu = 84$,
but then $S$ is not sensitive to changes in $\nu$ for large $\nu$. This gives $n = 38$ for $\lambda = 1.0$; the more
correct formula (30) also gives $n = 38$. For most applications we would probably have to
compromise on a larger $\lambda$ to get a more feasible $n$. In a similar way for $n = 5$, $\nu = 39$,
$\alpha = 0.05$, $C = 5^{-1/2}$, we find first $S = 3.50$, then $\lambda = 2.81$ from the approximation (33); the
more accurate (30) gives $\lambda = 2.87$.
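
The quick approximations (33) and (34) lend themselves to a short numerical sketch. Here $S$ is supplied from the text (3.41 for $n = 10$, 3.50 for $n = 5$) rather than computed from $F$ tables, the chi-square points are found by a simple bisection, and all function names are illustrative rather than the author's.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability Pr{chi-square with integer df > x}."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:
        q, m = math.erfc(math.sqrt(half)), 1
    else:
        q, m = math.exp(-half), 2
    while m < df:
        q += math.exp((m / 2.0) * math.log(half) - half - math.lgamma(m / 2.0 + 1.0))
        m += 2
    return q


def chi_upper_point(p, df):
    """Square root of the upper p point of chi-square with df D.F."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if chi2_sf(mid, df) > p else (lo, mid)
    return math.sqrt(0.5 * (lo + hi))


def lambda_33(S, C, beta, k):
    """Approximation (33): lambda = C (S + chi_{beta; k-2})."""
    return C * (S + chi_upper_point(beta, k - 2))


def lambda_34(C, alpha, beta, k):
    """Approximation (34): S replaced by its nu = infinity value."""
    return C * (chi_upper_point(alpha, k - 1) + chi_upper_point(beta, k - 2))


print(round(lambda_33(3.41, 10 ** -0.5, 0.10, 6), 2))  # n = 10, formula (33)
print(round(lambda_34(10 ** -0.5, 0.05, 0.10, 6), 2))  # n = 10, formula (34)
print(round(lambda_33(3.50, 5 ** -0.5, 0.10, 6), 2))   # n = 5, formula (33)
```

Setting $\lambda = 1.0$ in (33) and solving for $n$ with $S = 3.41$ gives $n \approx (S + \chi_{\beta;k-2})^2$, which is how the first approximation $n = 38$ in the text arises.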


5. DERIVATION OF $P_1$, $P_2$, ETC.


The probabilities $P_1$ and $P_2$ can be found neatly by continuing the geometric approach
of §2. Extending the notation there, denote by $|\eta|$ the length of the vector $\eta$,
$|\eta|^2 = \sum_{1}^{k-1}\eta_i^2$. Since the value $\vartheta$ of a normalized contrast is the projection of $\eta$ on $d$, it is
evident that

$$\vartheta_{\max} = |\eta|. \tag{35}$$

Suppose for the moment $\eta \ne 0$ (equivalent to $H$ false). If we imagine marking off the
projection of $\eta$ on $d$ for each unit vector $d$ to get a polar co-ordinate graph of the totality of
values of the normalized contrasts, we see the graph consists of a sphere which has the
vector $\eta$ as a diameter. The zero-valued contrasts for which we desire to make statement
(i) constitute the $(k-2)$-dimensional set of vectors $d$ in the tangent plane $\mathcal{P}$ to the
sphere at the origin. If $\eta = 0$ this picture collapses in an obvious way.

Consider now the totality of normalized contrasts whose values are $\ge \vartheta_0$, for which we
wish to make the statement (ii). There are none if $|\eta| < \vartheta_0$. If $|\eta| \ge \vartheta_0$ they fill a circular
cone $\mathcal{C}$ (by ‘cone’ we mean throughout one nappe of a cone) with axis along $\eta$, vertex at the
origin, and elements making an angle $\tfrac{1}{2}\pi - \psi$ with $\eta$, where $\psi$ is defined by (21). The cone
$\mathcal{C}'$ obtained by reflecting $\mathcal{C}$ in the origin is filled by the normalized contrasts whose values
are $\le -\vartheta_0$.

In the same $y$-space the estimated values of the normalized contrasts have a similar
graph, the diameter of the sphere being $\hat\eta$. From the interpretation of the confidence
interval (13) for the value $\vartheta$ of any normalized contrast $d$ as the projection of the sphere $\mathcal{S}$ on
$d$, it is easy to see that if $|\hat\eta| \le SC\hat\sigma$ the statement (i) is made for all the contrasts, while if
$|\hat\eta| > SC\hat\sigma$ the normalized contrasts for which we make the statement (ii) fill a cone $\mathcal{D}$ with
axis along $\hat\eta$, vertex at the origin, and elements making an angle $\arccos(SC\hat\sigma/|\hat\eta|)$ with
$\hat\eta$, and the normalized contrasts for which we make the statement (iii) fill the reflexion $\mathcal{D}'$
of $\mathcal{D}$ in the origin.

If $\eta \ne 0$, the event $\mathcal{E}_1$ of probability $P_1$ happens if and only if the projection of the sphere
$\mathcal{S}$ on the plane $\mathcal{P}$ covers the origin. It is convenient now to rotate the axes in the $y$-space
so that the vector $\eta$ lies along the positive $y_{k-1}$-axis. Denote the new coordinates of $\hat\eta$
by $\hat\zeta_1, \ldots, \hat\zeta_{k-1}$ and of $\eta$ by $\zeta_1, \ldots, \zeta_{k-1}$, respectively. Then $\hat\zeta_1, \ldots, \hat\zeta_{k-1}, \hat\sigma^2$ will be independent,
the $\hat\zeta_i$ will be normal with variance $C^2\sigma^2$, the means $\zeta_1, \ldots, \zeta_{k-2}$ will all be zero, while
$\zeta_{k-1} = |\eta|$. The plane $\mathcal{P}$ now has the equation $y_{k-1} = 0$. The distance of the centre of
$\mathcal{S}$ from the $y_{k-1}$-axis is $\bigl(\sum_{1}^{k-2}\hat\zeta_i^2\bigr)^{1/2}$, and so the projection of $\mathcal{S}$ on $\mathcal{P}$ will cover the origin if and
only if this distance does not exceed the radius of $\mathcal{S}$,

$$\sum_{1}^{k-2}\hat\zeta_i^2 \le S^2C^2\hat\sigma^2,$$

or

$$\frac{\sum_{1}^{k-2}\hat\zeta_i^2/(k-2)}{C^2\hat\sigma^2} \le (k-1)(k-2)^{-1}F_\alpha(k-1, \nu).$$

Since the random variable on the left side of the latter inequality has the $F$-distribution
with $k-2$ and $\nu$ D.F., we have now succeeded in deriving (20).

If $|\eta| \ge \vartheta_0$, the event $\mathcal{E}_2$ happens if and only if the fixed cone $\mathcal{C}$ lies entirely inside the
random cone $\mathcal{D}$. It may be verified that this is equivalent to the confidence sphere $\mathcal{S}$ lying
entirely within a cone $\mathcal{F}$ with axis along $\eta$, vertex at the origin, and elements making with
$\eta$ an angle $\psi$ defined by (21). The sphere $\mathcal{S}$ will touch the cone $\mathcal{F}$ on the inside if the distance
of its centre from the $y_{k-1}$-axis plus $\sec\psi$ times the radius of the sphere equals $\hat\zeta_{k-1}\tan\psi$.
Thus the event $\mathcal{E}_2$ happens if and only if

$$\Bigl(\sum_{1}^{k-2}\hat\zeta_i^2\Bigr)^{1/2} + CS\hat\sigma\sec\psi < \hat\zeta_{k-1}\tan\psi.$$

If we now divide this inequality through by $C\sigma\tan\psi$ we get the desired result for $P_2$ stated
in connexion with (22).

The monotone behaviour of $P_2(\gamma)$ stated in §4 was suggested by the above geometric
picture of the sphere $\mathcal{S}$ and the cone $\mathcal{F}$, and may be established by geometric arguments.
It will be simpler to give here an analytic proof utilizing the inequality (22). Denote by
$f(\psi)$ the conditional probability, given $\chi_2$, that this inequality is satisfied. Clearly it will
suffice to show that for all $\chi_2 \ge 0$, $f(\psi)$ is a monotone decreasing function of $\gamma$ for $\gamma > \lambda$, or
since $\operatorname{cosec}\psi = \gamma/\lambda$, that the derivative $f'(\psi)$ is positive.

Let us write the case of equality in (22) as

$$\chi_1 = z_1\tan\psi + B\sec\psi, \tag{36}$$

where $B = \lambda C^{-1} - \nu^{-1/2}S\chi_2$.







Because of the statistical independence of $\chi_1, \chi_2, z_1$, the conditional probability $f(\psi)$ may be
found by treating $\chi_2$ as a constant and working with the joint distribution of $\chi_1$ and $z_1$ in the
$z_1, \chi_1$-plane: $f(\psi)$ is the amount of probability in this plane above the $z_1$-axis and below the
line (36). We shall drop the subscripts from $\chi_1$ and $z_1$ in the rest of this calculation. The
$z$-intercept of the line (36) is $-B\operatorname{cosec}\psi$, and the probability $f(\psi)$ may thus be expressed as
the integral

$$f(\psi) = \int_{-B\operatorname{cosec}\psi}^{\infty} p_1(z)\,g(z, \psi)\,dz,$$

where

$$g(z, \psi) = \int_{0}^{z\tan\psi + B\sec\psi} p_2(\chi)\,d\chi,$$

and $p_1(z)$ and $p_2(\chi)$ are the densities of a standard normal deviate and of a $\chi$ variable with
$k-2$ D.F., respectively, and do not depend on $\psi$. Now

$$f'(\psi) = -p_1(-B\operatorname{cosec}\psi)\,g(-B\operatorname{cosec}\psi, \psi)\,\frac{\partial(-B\operatorname{cosec}\psi)}{\partial\psi} + \int_{-B\operatorname{cosec}\psi}^{\infty} p_1(z)\,\frac{\partial g(z, \psi)}{\partial\psi}\,dz,$$

and

$$\frac{\partial g(z, \psi)}{\partial\psi} = p_2(z\tan\psi + B\sec\psi)\,\frac{\partial(z\tan\psi + B\sec\psi)}{\partial\psi} = (z\sec^2\psi + B\sec\psi\tan\psi)\,p_2(z\tan\psi + B\sec\psi).$$

But $g(-B\operatorname{cosec}\psi, \psi) = 0$, and hence

$$f'(\psi) = \int_{-B\operatorname{cosec}\psi}^{\infty} (z\sec^2\psi + B\sec\psi\tan\psi)\,p_1(z)\,p_2(z\tan\psi + B\sec\psi)\,dz. \tag{37}$$

If $B \le 0$ the integrand is $\ge 0$ on the range of integration and then clearly $f'(\psi) > 0$. If $B > 0$
we transform the integral (37) by the substitution $w = z\sec\psi + B\tan\psi$ to get

$$f'(\psi) = \int_{-B\cot\psi}^{\infty} w\,p_1(w\cos\psi - B\sin\psi)\,p_2(w\sin\psi + B\cos\psi)\,dw.$$

By utilizing the explicit forms of the densities $p_1$ and $p_2$ this may be written

$$f'(\psi) = \int_{-B\cot\psi}^{\infty} v(w)\,dw, \tag{38}$$

where $v(w) = D\,[u(w)]^{k-3}\,w\,e^{-\frac{1}{2}w^2}$, $D$ is a positive constant, and

$$u(w) = w\sin\psi + B\cos\psi \tag{39}$$

is positive for $w > -B\cot\psi$. If we break the range of integration in (38) up into the intervals
$(-B\cot\psi, 0)$, $(0, B\cot\psi)$, $(B\cot\psi, \infty)$, and drop the last, where $v(w) > 0$, we get

$$f'(\psi) > \int_{0}^{B\cot\psi} [v(-w) + v(w)]\,dw. \tag{40}$$

For $0 < w < B\cot\psi$, $u(-w)$ is seen from (39) to be $< u(w)$, hence $|v(-w)| \le v(w)$ since $k \ge 3$,
and so the integrand in (40) is $\ge 0$ and therefore $f'(\psi) > 0$.

To calculate the limiting value $P_2(\infty)$ we revert to the geometric picture. To make $\gamma \to \infty$
we may let $|\eta| \to \infty$ with fixed $\sigma$. For $|\eta| \ge \vartheta_0$ we recall that $P_2(\gamma)$ is the probability that
the sphere $\mathcal{S}$ lies inside the cone $\mathcal{F}$. The cone $\mathcal{F}$ may be constructed by circumscribing a cone
with vertex at the origin about a sphere $\mathcal{T}$ with centre at $|\eta|$ on the $y_{k-1}$-axis and radius $\vartheta_0$.
As $|\eta| \to \infty$ this picture becomes indeterminate. However, we clearly get the same result
if we hold the sphere $\mathcal{T}$ and the joint distribution of $\hat\zeta_1, \ldots, \hat\zeta_{k-1}, \hat\sigma$ fixed and let the vertex
of the circumscribed cone $\mathcal{F}$ go to $-\infty$ on the $y_{k-1}$-axis. The limiting figure for the cone
$\mathcal{F}$ is now the cylinder circumscribed about $\mathcal{T}$ with elements parallel to the $y_{k-1}$-axis. The
probability $P_2(\infty)$ that the sphere $\mathcal{S}$ fall in this cylinder is the probability that

$$\Bigl(\sum_{1}^{k-2}\hat\zeta_i^2\Bigr)^{1/2} + CS\hat\sigma < \vartheta_0.$$

Division of this inequality by $C\sigma$ yields the desired (29) and concludes the derivations.


The writer is indebted to Mr Seiji Sugihara and Mr Judah Rosenblatt for making the 
numerical calculations. Tables of Merrington & Thompson (1943) were used in the 
calculation of Tables 1, 2, 4 and 8, tables of Pearson & Hartley (1942) for Tables 1 and 
6, tables of May (1952) for Table 2, tables of Merrington (1942) for Table 4, tables of 
Hartley & Pearson (1950) for Table 5, tables of Thompson (1941) and the W.P.A. tables 
(1942) for Table 7, and tables of Karl Pearson (1934) for Table 8. 


REFERENCES 


Eisenhart, C. (1947). The assumptions underlying the analysis of variance. Biometrics, 3, 1-21.

Fisher, R. A. (1935). The Design of Experiments, §24. Edinburgh: Oliver and Boyd.

Fisher, R. A. & Yates, F. (1943). Statistical Tables, Table 23. Edinburgh: Oliver and Boyd.

Hartley, H. O. & Pearson, E. S. (1950). Tables of the χ²-integral and of the cumulative Poisson
distribution. Biometrika, 37, 313-25.

Johnson, N. L. & Welch, B. L. (1940). Applications of the noncentral t-distribution. Biometrika, 31,
362-89.

May, Joyce M. (1952). Extended and corrected tables of the upper percentage points of the
‘Studentized’ range. Biometrika, 39, 192-3.

Merrington, M. (1942). Table of percentage points of the t-distribution. Biometrika, 32, 300.

Merrington, M. & Thompson, C. M. (1943). Tables of the percentage points of the inverted beta (F)
distribution. Biometrika, 33, 73-88.

Mood, A. M. (1950). Introduction to the Theory of Statistics, §§14.9-14.11. New York: McGraw-Hill.

Nandi, H. K. (1951). On the analysis of variance test. Bull. Calcutta Statist. Ass. 3, 103-14.

Newman, D. (1939). The distribution of range in samples from a normal population, expressed in
terms of an independent estimate of standard deviation. Biometrika, 31, 20-30.

Neyman, J. & Pearson, E. S. (1933). On the problem of the most efficient tests of statistical
hypotheses. Phil. Trans. A, 231, 289-337.

Pearson, E. S. & Hartley, H. O. (1942). The probability integral of the range in samples of n
observations from a normal population. Biometrika, 32, 301-10.

Pearson, K. (1934). Tables of the Incomplete Beta-Function. London: Biometrika Office.

Scheffé, H. (1952). An analysis of variance for paired comparisons. J. Amer. Statist. Ass. 47, 381-400.

Thompson, C. M. (1941). Table of percentage points of the χ²-distribution. Biometrika, 32, 187-91.

Tukey, J. W. (1951). Quick and dirty methods in statistics. Part II. Simple analyses for standard
designs. Proc. Fifth Annual Convention, Amer. Soc. for Quality Control, pp. 189-97.

W.P.A. Tables (1942). Tables of Probability Functions, vol. 2. Washington, D.C.: Nat. Bur. Standards.







THE ESTIMATION AND COMPARISON OF STRENGTHS OF 
ASSOCIATION IN CONTINGENCY TABLES 


By A. STUART 
Division of Research Techniques, London School of Economics 


1. INTRODUCTION 


In testing the significance of an observed association between two characteristics in a
contingency table, $\chi^2$ is generally used. When, however, we wish to measure strength of
association, some text-books recommend the use of Karl Pearson’s coefficient of contingency,
a function of $\chi^2$ which may be referred to the conventional range (−1, +1), although it
cannot attain the limits of this range. Kendall (1948a, chapter 13) gives references to the
work on the coefficient, largely by Karl Pearson himself, and points out the difficulty of
obtaining its sampling variance on the hypothesis of independence.

Yates (1948) proposed a test based on scores for association of characteristics of the kind 
considered in this paper, but its distribution, as he points out, is only obtained on the hypo- 
thesis of independence, and it becomes progressively more inaccurate as the degree of 
association increases. Williams (1952) has recently considered various tests based on scores. 

In this paper, a coefficient is proposed which is independent of scoring systems. It is, in 
fact, a suitably modified form of Kendall’s rank correlation coefficient, which depends only 
on ordinal properties. It is shown how the existing theory of the coefficient may be used to 
estimate the population association, to set confidence limits for it, and also to test the 
difference in the coefficients calculated for two contingency tables. 

Since the proposed coefficient depends on rank order, a condition for its applicability is 
that the rows and columns of the contingency table fall into a natural order. This is, in fact, 
generally the case. From one point of view the proposed coefficient may therefore be 
regarded as availing itself of more of the information supplied by the contingency table than 
does the contingency coefficient, which is invariant under interchanges of rows or columns. 

The confidence interval and the significance test obtained by use of the proposed coefficient 
are conservative, in the sense that a lower bound is provided for the confidence coefficient, 
and an upper bound for the probability of wrongly rejecting the hypothesis tested. This is 
a consequence of the fact that only an upper bound is known for the sampling variance of the 
coefficient, and this is in general a poor upper bound, although sometimes attainable. 
However, a conservative test is better than no test at all, and this test should be useful in 
many practical situations, especially where sample sizes are large. 


2. CONTINGENCY TABLES AND RANKINGS 


Kendall (1949) has suggested that his rank correlation coefficient t may be calculated for 
a contingency table whose rows and columns are ordered by the criteria of classification. 
An $r \times s$ table with a grand total of $n$ is regarded as two rankings of $n$ objects according to
characteristics for one of which only r separate ranks are distinguished, and for the other 
only s separate ranks. Looked at in this way, the marginal totals are simply the numbers of 
objects tied at each level. There is thus a one-one correspondence between any contingency 
table and a pair of rankings. 











106 Strength of association in contingency tables 


The original context of this suggestion was the estimation of the product-moment 
correlation parameter in the case of non-normal variation, but there is no reason why the 
coefficient $t$ should not be used in its own right as a measure of association for contingency
tables in which there is a natural order for rows and columns. 


3. THE DENOMINATOR OF $t$


The test commonly used to test the significance of an observed value of $t$, where there are
no ties in the rankings, is described by Kendall (1948b). Like the $\chi^2$ test for a contingency
table, it is made with reference to a hypothetical population obtained by the permutation
of sample observations. In calculating the coefficient, we may therefore use Daniels's (1944)
definition of $t$ in the universe of sample permutations. If $r_i$, $r_j$ are the ranks allotted to the
$i$th and $j$th objects ($i < j$), and $a_{ij}$, $b_{ij}$ are scores, one defined on each ranking, given by

$$a_{ij},\ b_{ij} = \begin{cases} +1 & (r_i < r_j), \\ -1 & (r_i > r_j), \end{cases}$$

this definition is

$$t = \frac{\sum a_{ij} b_{ij}}{\left(\sum a_{ij}^2 \sum b_{ij}^2\right)^{1/2}}. \tag{1}$$

The summations in (1) extend over all unequal suffixes. The property of the coefficient

$$-1 \le t \le +1 \tag{2}$$

is seen from (1) to be a consequence of the Cauchy inequality, the equalities in (2) being
attainable only when the members of each set of scores are all $+1$ or all $-1$. When there
are no ties, (1) is equivalent to the more normal definition

$$t = \frac{2S}{n(n-1)}, \tag{3}$$

where $S$ is a function of the number of inversions formed by the two rankings.

When we come to consider rankings containing ties, we must further define

$$a_{ij},\ b_{ij} = 0 \quad (r_i = r_j),$$

and alternative forms are now possible for $t$, for although it remains true that $\sum a_{ij} b_{ij} = 2S$,
the denominators of (1) and (3) no longer agree. If we are interested simply in testing whether
association exists, the discussion of which form of denominator to use for $t$ is unnecessary;
the test may be carried out directly on the numerator. But if we are concerned to measure
association, it is of some importance to choose a denominator so that the limits of the
conventional range $(-1, +1)$ are at least sometimes attainable. With this in mind, we may
now examine the possibilities.

(a) If, when there are ties present, we continue to use $n(n-1)$ as the denominator, as in
(3), we obtain the coefficient denoted by Kendall (1948b) as $t_a$.

(b) If, on the other hand, we continue to use (1), the coefficient is denoted by $t_b$. The fact
that some of the $a_{ij}$, $b_{ij}$ are zero implies that the denominator is decreased in the tied, as
compared to the untied, case. Nevertheless, it follows from the Cauchy inequality that the
equalities in (2) are not attainable by $t_b$ except in the trivial case when all the scores are
zero. It further follows that since $t_a$ has the same numerator but a larger denominator than
$t_b$, and $t_b$ cannot attain $\pm 1$, $t_a$ certainly cannot attain $\pm 1$. (This is analogous to the similar
deficiency of the contingency coefficient.) Where the number of ties is large, the deficiency










A. STUART 107 


will be serious. Since we are particularly interested in contingency tables, which often have 
large marginal totals, we must consider whether a more appropriate form of denominator 
exists. 

(c) Consider the maximum (positive or negative) numerator $\sum a_{ij} b_{ij}$ which can be produced
by a sample of $n$ observations arranged in an $r \times s$ table. This will be attained when all
observations lie in cells in a longest diagonal of the table, and the frequencies in these cells
are equal, or as nearly equal as possible. Any move away from this position will decrease the
total score $\sum a_{ij} b_{ij}$. A longest diagonal contains $m$ cells, where $m$ is the lesser of $r$ and $s$.

If $n$ is a multiple of $m$, the maximum sample score is

$$\sum a_{ij} b_{ij} = 2\left(\frac{n}{m}\right)^2 \{1 + 2 + \dots + (m-1)\} = n^2\,\frac{(m-1)}{m}. \tag{4}$$

(If $m = n$, we are back to the untied case, and (4) reduces to $n(n-1)$, as it should.) Thus
(4) is an attainable upper bound for $\sum a_{ij} b_{ij}$.

When $n$ is not a multiple of $m$, (4) remains an upper bound, but cannot be attained. This
is a quite unimportant deficiency in practice, when $n$ is often large and $m$ very small, and the
residue of $n$ to modulus $m$ necessarily small.














We therefore define

$$t_c = \frac{\sum a_{ij} b_{ij}}{n^2\,\dfrac{(m-1)}{m}} = \frac{2S}{n^2\,\dfrac{(m-1)}{m}}, \tag{5}$$

whence

$$t_c = \frac{(n-1)}{n}\,\frac{m}{(m-1)}\,t_a. \tag{6}$$

It follows that $t_c$ can sometimes attain, and for large $n$ can generally almost attain, $\pm 1$.
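The correspondence between a table and a pair of tied rankings, and the bound (4), can be checked by brute force on a small table. The sketch below is illustrative only; the function names and the example tables are assumptions introduced here, not part of the paper. It expands a table into one (row rank, column rank) pair per object and sums the score products over all ordered pairs of distinct objects.

```python
from itertools import permutations


def objects_from_table(table):
    """Expand a contingency table into one (row rank, column rank) pair per object."""
    objects = []
    for i, row in enumerate(table):
        for k, freq in enumerate(row):
            objects.extend([(i, k)] * freq)
    return objects


def sign(x):
    """Score +1, -1 or 0 (tie) for a rank difference."""
    return (x > 0) - (x < 0)


def pair_score(table):
    """Sum of a_ij * b_ij over all ordered pairs of distinct objects."""
    objects = objects_from_table(table)
    return sum(sign(v[0] - u[0]) * sign(v[1] - u[1])
               for u, v in permutations(objects, 2))


def max_score(n, m):
    """Upper bound (4), attained when n is a multiple of m."""
    return n * n * (m - 1) / m


# Six observations spread equally along the diagonal of a 3 x 3 table.
diagonal = [[2, 0, 0],
            [0, 2, 0],
            [0, 0, 2]]
print(pair_score(diagonal), max_score(6, 3))
```

With all observations spread equally along a longest diagonal the pairwise sum equals the bound, so that $t_c = 1$ for such a table.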


4. THE SAMPLING PROPERTIES OF $t_c$


Remembering the correspondence between a pair of rankings and a contingency table, let
us consider the cell frequencies of an $r \times s$ table,

$$f_{ik} \quad (i = 1, \dots, r;\ k = 1, \dots, s),$$

as having been sampled from a population $r \times s$ table with frequencies $F_{ik}$. Then

$$t_a = \frac{1}{n(n-1)} \sum a_{ij} b_{kl} f_{ik} f_{jl}$$

and

$$E(t_a) = \frac{1}{n(n-1)} \sum a_{ij} b_{kl} E(f_{ik} f_{jl}),$$

which, by the ordinary formula for sampling from a finite multinomial population, becomes

$$E(t_a) = \frac{1}{n(n-1)}\,\frac{n(n-1)}{N(N-1)} \sum a_{ij} b_{kl} F_{ik} F_{jl}
= \frac{1}{N(N-1)} \sum a_{ij} b_{kl} F_{ik} F_{jl} = \tau_a, \tag{7}$$

where $\tau_a$ is defined for the population exactly as is $t_a$ for the sample. Thus $t_a$ is unbiased
for $\tau_a$.













Now Hoeffding (1948) has shown that for large $N$, the sampling distribution of $t_a$ tends
to the normal form as $n$ increases, and its sampling variance obeys the inequality, first given
by Daniels & Kendall (1947) for the untied case,

$$\operatorname{var} t_a \le \frac{2}{n}(1 - \tau_a^2). \tag{8}$$

If we now define, analogously to (6),

$$\tau_c = \frac{(N-1)}{N}\,\frac{m}{(m-1)}\,\tau_a,$$

and use (6), (7) and (8), we find

$$E(t_c) = \frac{(n-1)}{n}\,\frac{N}{(N-1)}\,\tau_c,$$

$$\operatorname{var} t_c \le \frac{2}{n}\left(\frac{n-1}{n}\right)^2 \left\{\left(\frac{m}{m-1}\right)^2 - \left(\frac{N}{N-1}\right)^2 \tau_c^2\right\},$$

which for large $N$ and $n$ reduce to

$$E(t_c) = \tau_c, \qquad \operatorname{var} t_c \le \frac{2}{n}\left\{\left(\frac{m}{m-1}\right)^2 - \tau_c^2\right\}. \tag{9}$$

As usual, we may estimate $\tau_c$ in the sampling variance by $t_c$.

The net effect of using $t_c$ is therefore to multiply the value of our measure of association
by $m/(m-1)$ approximately, and increase its sampling variance appropriately, while its
property of unbiased estimation of its population correspondent $\tau$ becomes asymptotic.

(9) may be used to estimate $\tau_c$, and to set conservative confidence limits for it. Similarly,
a conservative test for the significance of the difference between two observed values of $t_c$
follows immediately. Conservative procedures are forced upon us by the fact that only an
upper bound to the sampling variance of the coefficient can be obtained. Daniels (1950) has
shown that this upper bound, though sometimes attainable, is in general a poor one.
Nevertheless, when $n$ is at all large, even these conservative limits are close enough together for
many practical purposes.








5. THE CALCULATION OF S 


Arrange the columns of the table so that they are ordered from left to right, and the rows 
so that they are ordered in the same sense from top to bottom. That is to say, our ‘origin’ is 
at the top left corner of the table. 

For each cell in the table, multiply its frequency: 

(i) positively, by the frequencies of all cells lying below it and to its right; 

(ii) negatively, by the frequencies of all cells lying below it and to its left; 

ignore all cells lying above it, or in the same row, or in the same column below it. 

S is the sum obtained after applying this process to every cell in the table. Since the cells 

in the last row have none below them, their score is always zero. 
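The rule above translates directly into a double loop over cells. The following is a minimal sketch, assuming the table is supplied as a list of rows already arranged in the order described; the function name and the small example table are illustrative and not taken from the paper.

```python
def kendall_score(table):
    """Return S for a contingency table whose rows and columns are in natural order."""
    n_rows, n_cols = len(table), len(table[0])
    score = 0
    for i in range(n_rows):
        for k in range(n_cols):
            below = range(i + 1, n_rows)
            # (i) cells below and to the right enter positively
            below_right = sum(table[j][l] for j in below for l in range(k + 1, n_cols))
            # (ii) cells below and to the left enter negatively
            below_left = sum(table[j][l] for j in below for l in range(k))
            # cells above, in the same row, or in the same column are ignored
            score += table[i][k] * (below_right - below_left)
    return score


small = [[2, 1],
         [1, 3]]
print(kendall_score(small))  # 2*3 - 1*1 = 5
```

Cells in the last row contribute nothing because both sums are then empty, in agreement with the remark in the text.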


6. EXAMPLE 


Tables 1 and 2, based on case records of the eye-testing of employees in Royal Ordnance 
factories in 1943-6, have been constructed from data very kindly made available by the 
Association of Optical Practitioners. 










First, we calculate the $t_c$ association coefficient for each table. Using the method described
in §5, we find that for men (Table 1), $S = +2{,}480{,}223$, so that

$$t_c = \frac{2S}{n^2\,\dfrac{(m-1)}{m}} = \frac{2 \times 2{,}480{,}223}{(3242)^2 \times \tfrac{3}{4}} = +0.629.$$

For women (Table 2), similarly, $S = +13{,}264{,}256$, so that

$$t_c = \frac{2 \times 13{,}264{,}256}{(7477)^2 \times \tfrac{3}{4}} = +0.633.$$


Table 1. 3242 men aged 30-39: unaided distance vision

                               Left eye
Right eye         Highest    Second     Third    Lowest     Total
                    grade     grade     grade     grade
Highest grade         821       112        85        35      1053
Second grade          116       494       145        27       782
Third grade            72       151       583        87       893
Lowest grade           43        34       106       331       514
Total                1052       791       919       480      3242

Table 2. 7477 women aged 30-39: unaided distance vision

                               Left eye
Right eye         Highest    Second     Third    Lowest     Total
                    grade     grade     grade     grade
Highest grade        1520       266       124        66      1976
Second grade          234      1512       432        78      2256
Third grade           117       362      1772       205      2456
Lowest grade           36        82       179       492       789
Total                1907      2222      2507       841      7477

These values are very close together, and in the ordinary course of events we would not,
perhaps, bother to test their difference. For illustrative purposes, however, we may carry
through the calculations.

Our estimate of the upper bound to the sampling variance of $t_c$ for men is, by (9),

$$\frac{2}{3242}\left\{(\tfrac{4}{3})^2 - (0.629)^2\right\} = 0.000853,$$

and for women, similarly, it is

$$\frac{2}{7477}\left\{(\tfrac{4}{3})^2 - (0.633)^2\right\} = 0.000368.$$






















































































The values of the coefficient, and their difference, with the estimates of the appropriate 
maximum standard errors, and the conservative confidence limits based on two standard 
errors, are therefore as follows: 








                 t_c       Maximum             2 s.e. conservative
                           standard error      confidence limits
Men            +0.629      0.029               0.571 to 0.687
Women          +0.633      0.019               0.595 to 0.671
Difference      0.004      0.035               -




















It follows, as we anticipated, that the difference in strength of association has not been 
shown to be significant. 
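The arithmetic of this example can be retraced in a few lines. The sketch below is an illustration, not the author's computing procedure: the function names are assumptions, the cell frequencies are those of Tables 1 and 2, and the variance bound is the large-sample form (9) with $\tau_c$ estimated by $t_c$. Because it carries unrounded values of $t_c$, the last printed digit can differ slightly from the figures quoted above, which use $t_c$ rounded to three places.

```python
import math


def kendall_score(table):
    """S of section 5: below-right cells count positively, below-left negatively."""
    n_rows, n_cols = len(table), len(table[0])
    total = 0
    for i in range(n_rows):
        for k in range(n_cols):
            below = range(i + 1, n_rows)
            plus = sum(table[j][l] for j in below for l in range(k + 1, n_cols))
            minus = sum(table[j][l] for j in below for l in range(k))
            total += table[i][k] * (plus - minus)
    return total


def t_c(table):
    """Coefficient (5): 2S divided by n^2 (m - 1) / m."""
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    return 2 * kendall_score(table) / (n * n * (m - 1) / m)


def variance_bound(t, n, m):
    """Upper bound (9) to the sampling variance, with tau_c estimated by t."""
    return (2 / n) * ((m / (m - 1)) ** 2 - t * t)


MEN = [[821, 112, 85, 35], [116, 494, 145, 27], [72, 151, 583, 87], [43, 34, 106, 331]]
WOMEN = [[1520, 266, 124, 66], [234, 1512, 432, 78], [117, 362, 1772, 205], [36, 82, 179, 492]]

t_men, t_women = t_c(MEN), t_c(WOMEN)
var_men = variance_bound(t_men, 3242, 4)
var_women = variance_bound(t_women, 7477, 4)
se_diff = math.sqrt(var_men + var_women)  # maximum standard error of the difference

for name, t, var in (("Men", t_men, var_men), ("Women", t_women, var_women)):
    se = math.sqrt(var)
    print(f"{name}: t_c = {t:.3f}, max s.e. = {se:.3f}, "
          f"limits {t - 2 * se:.3f} to {t + 2 * se:.3f}")
print(f"Difference = {abs(t_men - t_women):.3f}, max s.e. = {se_diff:.3f}")
```

The difference of the two coefficients is far smaller than twice its maximum standard error, which is the basis of the conclusion drawn in the text.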


REFERENCES 


Daniels, H. E. (1944). Biometrika, 33, 129.

Daniels, H. E. & Kendall, M. G. (1947). Biometrika, 34, 197.

Daniels, H. E. (1950). J.R. Statist. Soc. B, 12, 171.

Hoeffding, W. (1948). Ann. Math. Statist. 19, 293.

Kendall, M. G. (1948a). The Advanced Theory of Statistics, vol. 1, 4th ed.
London: Charles Griffin.

Kendall, M. G. (1948b). Rank Correlation Methods. London: Charles Griffin.

Kendall, M. G. (1949). Biometrika, 36, 177.

Williams, E. J. (1952). Biometrika, 39, 274.

Yates, F. (1948). Biometrika, 35, 176.










A SEQUENTIAL TEST FOR RANDOMNESS 


By P. G. MOORE 
University College, London 


1. The problem frequently arises of deciding whether a sequence of observations, each 
observation falling into one of two alternative categories or types, occurs in a random order. 
David (1947) has considered the 'group' test in some detail. The hypothesis, $H_0$, that we
are considering here states that there is randomness within the sequence against an
alternative, $H_1$, that there is dependence of the kind found in a simple Markoff chain.
David’s test is based on the number of groups of a common type which the observations 
form when placed in order in the sequence. 

In this paper we are going to deal with another form of this procedure which leads to 
a sequential test. This has obvious advantages in that the amount of data which must be 
collected in order to obtain a decision is not fixed beforehand. It may be that the first few 
values will suffice to make a decision one way or the other and only in cases which lie in 
between will a long sequence of observations be necessary. This form of procedure may well 
mean a large saving of time, money and materials. 

2. In the sequence of alternatives we write 1 for the happening of a certain event and 0
for its negation. Let $E_s$ represent the $s$th event, which may be either a 1 or a 0. The theoretical
model envisaged is that of a simple Markoff chain. The general principles of such a chain
have been discussed at some length by Fréchet (1938) and the notation that we will use is that

$$\Pr\{E_s = 1\} = P, \qquad \Pr\{E_s = 0\} = Q,$$

$$\Pr\{E_s = 1 \mid E_{s-1} = 1\} = p_1, \qquad \Pr\{E_s = 0 \mid E_{s-1} = 1\} = q_1,$$

$$\Pr\{E_s = 1 \mid E_{s-1} = 0\} = p_2, \qquad \Pr\{E_s = 0 \mid E_{s-1} = 0\} = q_2,$$

where $P + Q = 1$, $p_1 + q_1 = 1$, $p_2 + q_2 = 1$.

These equations define the general set-up. The hypothesis of randomness is a special case
of this Markoff chain when the probabilities of the two possible results of a trial are the same
whatever the result of the previous trial. This means that $p_1 = p_2 = p$ (say) and hence, of
course, $q_1 = q_2$.

It is necessary to make some assumption concerning the start of the sequence. If we
assume that the sequence's start is at a randomly selected point in a longer sequence which
also obeys the same probability model as has been given, then

$$\Pr\{E_1 = 1\} = \Pr\{E_1 = 1, E_0 = 1\} + \Pr\{E_1 = 1, E_0 = 0\},$$

where $E_1$ is the event chosen to start the sequence and $E_0$ is the previous event. Hence

$$P = p_1 P + p_2 Q,$$

or $P(1 - p_1) = p_2 Q$, giving $p_2 = P q_1 / Q$.
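The starting condition can be illustrated numerically. The following sketch (the function name is an assumption made for illustration) solves $P = p_1 P + p_2 Q$ with $Q = 1 - P$ for $P$, and checks the result against the relation $p_2 = P q_1 / Q$.

```python
def stationary_probability(p1, p2):
    """Solve P = p1*P + p2*(1 - P) for P, the unconditional probability of a 1."""
    q1 = 1 - p1
    return p2 / (q1 + p2)


P = stationary_probability(0.3, 0.7)
Q = 1 - P
print(P)             # equals 0.5 up to floating-point rounding
print(P * 0.7 / Q)   # P*q1/Q recovers p2 = 0.7 up to rounding
```

Under the hypothesis of randomness, $p_1 = p_2 = p$, the same function simply returns $p$.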
We will denote the hypothesis of randomness (specifying a particular value of $p$) by $H_0$
and a Markoff chain set-up as defined by the equations given earlier as $H_1$. Then it is known
(David, 1947, p. 335) that, under the assumptions made above, if there be an even number
of groups ($2t$), then

$$\Pr\{t \mid r_1, r_2, H_0\} = k\;{}^{r_1-1}C_{t-1}\;{}^{r_2-1}C_{t-1}, \tag{1}$$

$$\Pr\{t \mid r_1, r_2, H_1\} = k'\;{}^{r_1-1}C_{t-1}\;{}^{r_2-1}C_{t-1}\left(\frac{p_2 q_1}{p_1 q_2}\right)^t, \tag{2}$$











112 A sequential test for randomness 


where $r_1$ and $r_2$ are the numbers of 1's and 0's respectively and $k$ and $k'$ are constants such
that $\sum \Pr\{2t\} = 1$, where $\sum$ denotes summation over all possible values of $t$.

For the case of an odd number of groups the expressions are more complicated. If there
are $2t + 1$ groups, then

$$\Pr\{t \mid r_1, r_2, H_0\} = l\left[{}^{r_1-1}C_{t}\;{}^{r_2-1}C_{t-1} + {}^{r_1-1}C_{t-1}\;{}^{r_2-1}C_{t}\right],$$

$$\Pr\{t \mid r_1, r_2, H_1\} = l'\left(\frac{p_2 q_1}{p_1 q_2}\right)^t\left[\frac{P}{p_1}\;{}^{r_1-1}C_{t}\;{}^{r_2-1}C_{t-1} + \frac{Q}{q_2}\;{}^{r_1-1}C_{t-1}\;{}^{r_2-1}C_{t}\right],$$

$l$ and $l'$ being constants such that the sum of the probabilities over all possible values of $t$ is
unity. Moore (1949) has shown that for $t$ large the distributions of $\Pr\{t\}$ under either $H_0$ or
$H_1$ approximate to one another. That is to say, that whether there be an odd or even number
of groups, the distribution of $\Pr\{t \mid H\}$ is approximately the same. Hence for the purposes
of this paper attention will be focused on the case of an even number of groups, bearing in
mind that the results, at any rate for a large number of groups, apply equally well to the
case of an odd number of groups.

It is necessary to find the values of $k$ and $k'$ for use in equations (1) and (2). This is most
easily done by considering the case, $t = 1$, and finding the exact probability under $H_0$ and
$H_1$ by simple enumeration. We find that

$$k = 2p^{r_1} q^{r_2}, \tag{3}$$

$$k' = p_1^{r_1} q_2^{r_2}\left[\frac{P}{p_2} + \frac{Q}{q_1}\right]. \tag{4}$$

3. To test a hypothesis $H_0$ against a hypothesis $H_1$, with a risk $\alpha$ of rejecting $H_0$ when
it is true and $\beta$ of rejecting $H_1$ when it is true, the sequential test procedure consists of
calculating at each stage of the sampling the likelihood ratio $L$. If

$$L \ge \frac{1-\beta}{\alpha} \tag{5}$$

we accept $H_1$, whereas if

$$L \le \frac{\beta}{1-\alpha} \tag{6}$$

we accept $H_0$, and if $L$ lies in between the values in (5) and (6) we continue to take observations.
The likelihood ratio in this case is

$$L = \frac{\Pr\{t \mid r_1, r_2, H_1\}}{\Pr\{t \mid r_1, r_2, H_0\}}
= \frac{p_1^{r_1} q_2^{r_2}}{2p^{r_1} q^{r_2}}\left[\frac{P}{p_2} + \frac{Q}{q_1}\right]\left(\frac{p_2 q_1}{p_1 q_2}\right)^t. \tag{7}$$


This is a perfectly general expression and valid for all values of the variables. We now
propose to consider the particular case where under $H_0$, $p = q = 0.5$. Under $H_1$ we will take
$P = Q = 0.5$ and $p_1$ equal to some particular value other than 0.5. It will be noticed that
this simplification has practical application in the consideration of runs above and below
the median value in a sequence. It is important that the value used as median should be
near to the true value; if it is not, the test instead of picking out a departure from randomness
may only be drawing attention to a large difference in the proportion of 1's and 0's. From






















P. G. Moore 113


§2 we have that $p_2 = q_1$ because $P = Q$ and hence $p_1 = q_2$ since $p_2 + q_2 = 1$. The expression
for $L$ may now be re-written under these simplifying assumptions as

$$L = \frac{p_1^{R}\,\dfrac{1}{q_1}\left(\dfrac{q_1}{p_1}\right)^{2t}}{2(\tfrac{1}{2})^{R}}
= 2^{R-1} p_1^{R-2t} q_1^{2t-1}
= (2p_1)^{R-2t}(2 - 2p_1)^{2t-1}, \tag{8}$$

where $R = r_1 + r_2$.
Thus $L$, in this particular case, is seen to be independent of the actual values of $r_1$ and
$r_2$ and only dependent on the total length of the sequence.
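As a numerical check of this reduction, the general ratio (7) can be compared with the simplified form (8). The sketch below is illustrative only. It assumes the constants $k = 2p^{r_1}q^{r_2}$ and $k' = p_1^{r_1} q_2^{r_2}(P/p_2 + Q/q_1)$, obtained by enumerating the two possible arrangements when $t = 1$; the function names and the trial values are hypothetical.

```python
import math


def general_ratio(t, r1, r2, p, P, p1, p2):
    """Likelihood ratio (7) for 2t groups, r1 ones and r2 zeros."""
    q, Q, q1, q2 = 1 - p, 1 - P, 1 - p1, 1 - p2
    k = 2 * p ** r1 * q ** r2                           # constant under H0
    k_prime = p1 ** r1 * q2 ** r2 * (P / p2 + Q / q1)   # constant under H1 (assumed form)
    return (k_prime / k) * (p2 * q1 / (p1 * q2)) ** t


def simplified_ratio(t, R, p1):
    """Likelihood ratio (8), valid when p = q = P = Q = 0.5."""
    return (2 * p1) ** (R - 2 * t) * (2 - 2 * p1) ** (2 * t - 1)


# With P = Q the starting condition gives p2 = q1, so p1 = 0.3 implies p2 = 0.7.
full = general_ratio(t=6, r1=9, r2=7, p=0.5, P=0.5, p1=0.3, p2=0.7)
short = simplified_ratio(t=6, R=16, p1=0.3)
print(full, short)
```

Changing `r1` and `r2` while keeping their sum fixed leaves both values unchanged, which is the independence noted in the text.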


4. Under the sequential system observations are taken for so long as $L$ satisfies the
inequality

$$\beta/(1-\alpha) < L < (1-\beta)/\alpha.$$

Expressing this inequality in terms of logarithms and substituting for $L$ the expression
obtained in (8), we get that observations are to be taken for so long as

$$\log\frac{\beta}{1-\alpha} < (R - 2t)\log 2p_1 + (2t - 1)\log(2 - 2p_1) < \log\frac{1-\beta}{\alpha},$$

which in turn may be written as

$$\log\left\{\frac{\beta}{1-\alpha}(2 - 2p_1)\right\} < R\log 2p_1 + 2t\log\frac{1-p_1}{p_1} < \log\left\{\frac{1-\beta}{\alpha}(2 - 2p_1)\right\}. \tag{9}$$

The test procedure may now be carried out graphically. The horizontal axis of the graph
corresponds to $R$ and the vertical axis to $t$. Then if we draw the relationship given in (9),
replacing the inequality signs by equality signs in turn, we obtain two parallel straight lines.
Then, provided the point $(R, t)$ falls inside these two lines, a further observation is to be
taken. If the point falls below the lower line then we accept the alternative hypothesis,
$H_1$, if $p_1$ is greater than 0.5 or the null hypothesis, $H_0$, if $p_1$ is less than 0.5. Similarly if the
point falls above the upper line we accept $H_0$ if $p_1$ is greater than 0.5 and $H_1$ if $p_1$ is less
than 0.5.
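The two parallel lines can be obtained numerically from (9) by dividing through by $\log 2p_1$. The sketch below is a minimal illustration under that rearrangement; the function name and return convention are assumptions, and natural logarithms are used (any base gives the same lines).

```python
import math


def critical_lines(p1, alpha, beta):
    """Return (c, k0, k1) so that the two critical lines are
    R + c*t = k0 (boundary for accepting H0) and R + c*t = k1 (boundary for accepting H1).
    p1 must differ from 0.5, otherwise log(2*p1) is zero."""
    a = math.log(2 * p1)
    c = 2 * math.log((1 - p1) / p1) / a
    k0 = math.log(beta * (2 - 2 * p1) / (1 - alpha)) / a
    k1 = math.log((1 - beta) * (2 - 2 * p1) / alpha) / a
    return c, k0, k1


c, k0, k1 = critical_lines(p1=0.3, alpha=0.05, beta=0.05)
print(f"R {c:+.4f} t = {k0:.4f}")
print(f"R {c:+.4f} t = {k1:.4f}")
```

For $p_1 = 0.3$ and $\alpha = \beta = 0.05$ this gives a common coefficient of $t$ close to $-3.3174$ and constants close to $5.1054$ and $-6.4228$.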


5. The whole procedure is best illustrated by means of an example. The data are taken
from Brunt (1925) and consist of the annual rainfall in inches at London for each of the
years 1813-1912. To carry out the test we have described it is necessary to decide how to
make the division of the rainfalls into two groups. After 16 years the median rainfall, that
is, the average of the eighth and ninth rainfalls when the 16 are ranked in order, is 23.87 in.
In Table 1 each year is given the symbol 1 if the rainfall is above this median value and the
symbol 0 if it is below the median value. Only the first 45 years are given in Table 1 for
economy of space.

In this example we will test the hypothesis of randomness ($p = q = 0.5$) against the
alternative that $p_1 < 0.5$, taking as the particular alternative required to define the
procedure, $p_1 = 0.3$. Thus we are testing whether wet years tend to be followed by dry years
or not. Both $\alpha$ and $\beta$ will be taken as 0.05, and hence our two equations for the critical lines
are

$$R\log(0.6) + 2t\log(\tfrac{7}{3}) = \log\{(\tfrac{1}{19})(1.4)\},$$

$$R\log(0.6) + 2t\log(\tfrac{7}{3}) = \log\{(19)(1.4)\}.$$












Evaluating these equations numerically we obtain the two parallel lines

$$R - 3.3174t - 5.1054 = 0, \tag{10}$$

$$R - 3.3174t + 6.4228 = 0. \tag{11}$$

The lines given by (10) and (11) have been plotted in Fig. 1 and the figure completed by
putting in the observed points $(R, t)$ for $R$ greater than 16. Note that the first sixteen

Table 1. Rainfall at London in inches

Year   Rainfall        Year   Rainfall        Year   Rainfall
1813   23.56  0        1828   27.88  1        1843   25.85  1
1814   26.07  1        1829   25.32  1        1844   22.65  0
1815   21.86  0        1830   25.08  1        1845   22.75  0
1816   31.24  1        1831   27.76  1        1846   26.36  1
1817   23.65  0        1832   19.82  0        1847   17.70  0
1818   23.88  1        1833   24.78  1        1848   29.81  1
1819   26.41  1        1834   20.12  0        1849   22.93  0
1820   22.67  0        1835   24.34  1        1850   19.22  0
1821   31.69  1        1836   27.42  1        1851   20.63  0
1822   23.86  0        1837   19.44  0        1852   35.34  1
1823   24.11  1        1838   21.63  0        1853   25.89  1
1824   32.43  1        1839   27.49  1        1854   18.65  0
1825   23.26  0        1840   19.43  0        1855   23.06  0
1826   22.57  0        1841   31.13  1        1856   22.21  0
1827   23.00  0        1842   23.09  0        1857   22.18  0






































[Figure: value of $t$ (vertical axis, marked 4 to 12) plotted against value of $R$ (horizontal axis, 16 to 36), with the two parallel critical lines and the regions 'Continue sampling' and 'Accept $H_0$'.]
Fig. 1. Sequential scheme for group test $H_0$: $p = q = 0.5$; $H_1$: $P = Q = 0.5$, $p_1 = 0.3$.


observations contain six groups of 0's and six groups of 1's, so that $R = 16$, $2t = 12$. The
chart therefore starts with a point having co-ordinates (16, 6). We see that the point remains
inside the lines until $R$ is equal to 36 when $t = 13$ and it goes outside the upper line. Thus at
this stage we would cease taking observations and accept the hypothesis that wet years
tend to be followed by dry years and vice versa.
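The path of the point $(R, t)$ can be retraced from the symbols of Table 1. The sketch below is an illustration under two stated assumptions: testing begins at $R = 16$, once the median has been fixed, and when the number of groups is odd ($2t + 1$) the same value of $t$ is used in the even-group criterion (9), as the approximation of §2 suggests. The function and variable names are hypothetical.

```python
import math

# Symbols of Table 1 for 1813-1857 (1 = above the median of 23.87 in., 0 = below).
SYMBOLS = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0,
           1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0,
           1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0]


def sequential_group_test(symbols, p1, alpha, beta, start):
    """Apply criterion (9) after each observation from the `start`-th onwards."""
    a = math.log(2 * p1)
    b = math.log((1 - p1) / p1)
    lower = math.log(beta * (2 - 2 * p1) / (1 - alpha))
    upper = math.log((1 - beta) * (2 - 2 * p1) / alpha)
    groups, previous = 0, None
    for R, symbol in enumerate(symbols, start=1):
        if symbol != previous:        # a change of symbol opens a new group
            groups += 1
            previous = symbol
        if R < start:
            continue
        t = groups // 2               # assumption: 2t or 2t + 1 groups both give t
        statistic = R * a + 2 * t * b
        if statistic >= upper:
            return R, t, "accept H1"
        if statistic <= lower:
            return R, t, "accept H0"
    return len(symbols), groups // 2, "continue sampling"


print(sequential_group_test(SYMBOLS, p1=0.3, alpha=0.05, beta=0.05, start=16))
```

Under these assumptions the procedure stops at $R = 36$ with $t = 13$ in favour of $H_1$, in agreement with the chart described above.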













We have already noted that since $H_0$ states $p = q = 0.5$, it is essential that as the sequence
progresses a check is made that we have about equal numbers of 1's and 0's. This is to
prevent the discriminating power of the test being blunted by a wrong choice of the median.
In this particular case we find that after thirty-two observations, for instance, there are
about equal numbers, in fact seventeen 1's and fifteen 0's, so that there is no reason to
question the choice of critical value taken for the median. Had there been a very wide
discrepancy it would have been best to have obtained a new median value and recalculated
the value of $t$. The critical lines would remain, of course, the same as before.

In the procedure just carried out we tested the hypothesis of randomness against the 
hypothesis that $p_1 = 0.3$, meaning that wet years tend to follow dry years or that a negative
correlation exists between successive years. A more general problem would be to test the 
hypothesis of randomness against a Markoff chain that was either positive or negative in its 
correlation between successive events. This means that there would be three alternatives, 
(a) positive correlation, (b) randomness and (c) negative correlation between successive 
events. In this case the problem could be tackled by setting up two simple sequential 
schemes. The first would be the same as that already described, and the second scheme would 
be set up with $p_1 = 0.7$ (say). If at any stage in the sampling the first scheme told us to
accept the hypothesis that $p_1 = 0.3$ we would do so, and similarly if the second scheme told
us to accept $p_1 = 0.7$ we would do so. However, we would only accept the hypothesis of
randomness if both schemes told us to do so. If none of these three results has occurred we 
continue sampling. Both sets of sequential limits may be drawn on the one diagram, which 
makes the whole procedure relatively simple to carry out. An example of having two 
sequential schemes superimposed on the one figure has been given by Armitage (1947). 
He was concerned with testing whether the mean of a sampled population was equal to 
some specified value against the alternative that it was either larger or smaller than the 
specified value. 


6. It is useful to have a knowledge of the expected sample size associated with any 
sequential test before starting the test procedure. Unfortunately, as we are discussing a case 
where the observations are not independent of one another the usual formulae for estimating 
the average sample size are not applicable. It is, however, true to say that the sequential 
procedure will lead to a saving of observations in many cases, since frequently decisions 
are made on fewer observations than with the fixed sample size of the usual methods of 
testing. 

The test that has been derived and illustrated here has been concerned with the case of
$p = q = 0.5$. It is clear from the derivation in §3 that the procedure could be used for
$p$ unequal to $q$ by suitable adjustments. It would also be possible to base the test on a
Markoff chain principle under which the observation depends not just on the immediately 
preceding observation but on the two preceding observations. This case will, it is hoped, be 
investigated in a later paper. 


REFERENCES 


Armitage, P. (1947). J. R. Statist. Soc. B, 9, 250.

Brunt, D. (1925). Phil. Trans. A, 225, 247.

David, F. N. (1947). Biometrika, 34, 335.

Fréchet, M. (1938). Recherches théoriques modernes sur le calcul des probabilités, Book 2.

Moore, P. G. (1949). Biometrika, 36, 305.










ON THE MEAN SUCCESSIVE DIFFERENCE AND ITS RATIO 
TO THE ROOT MEAN SQUARE 


By A. R. KAMAT 
University College, London 


1. INTRODUCTION 


Let $x_i$ ($i = 1, 2, \dots, n$) denote a sequence of $n$ normal variates with means $\mu_i$ and a common
variance $\sigma^2$. If all means $\mu_i$ are identical, i.e. if $\mu_i = \mu$, then the best estimator of $\sigma^2$ is of
course

$$s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}.$$

Situations, however, arise in which $\mu_i$ may have a 'slow-moving'
continuous trend, and for such situations two alternative estimators of $\sigma$ have been
considered. They are: the mean-square successive difference

$$\delta^2 = \sum_{i=1}^{n-1} \frac{(x_i - x_{i+1})^2}{n-1},$$

and the mean successive difference

$$d = \sum_{i=1}^{n-1} \frac{|x_i - x_{i+1}|}{n-1}.$$


Von Neumann, Kent, Bellinson & Hart (1941) have given a brief historical review of these 
statistics and their use in ballistics and in astronomy, but their results deal only with the 
statistic $\delta^2$. As to results about $d$, Arley & Hald (1950) and Guest (1951) have derived the
variance of d and have discussed its efficiency.* Quite recently, also, the use of d in quality 
control chart technique has been discussed by Keen & Page (1953). 

Using the results on the absolute moments of the multivariate normal distribution 
obtained by Nabeya (1951, 1952), and also derived for this purpose by Kamat (1953a), the 
first four moments of d can now be evaluated. In the present paper we propose to discuss 
the approximate distributions of the mean successive difference, d, and its ratio to the root 
mean square, viz. d/s, based on their first four moments. 


2. THE DISTRIBUTION OF THE MEAN SUCCESSIVE DIFFERENCE 
2.1. Preliminary results


Let us assume that $\mu_i = \mu$ so that $x_1, x_2, \dots, x_n$ is a sample from a normal population with
constant mean $\mu$ and standard deviation $\sigma$. We shall use the following notation:

$$\delta_i = x_i - x_{i+1}, \qquad d_i = |\delta_i| = |x_i - x_{i+1}| \quad (i = 1, 2, \dots, n-1),$$

$$D = \sum_{i=1}^{n-1} d_i = \sum_{i=1}^{n-1} |x_i - x_{i+1}|.$$

Then the mean successive difference is

$$d = \frac{D}{n-1} = \sum_{i=1}^{n-1} \frac{d_i}{n-1} = \sum_{i=1}^{n-1} \frac{|x_i - x_{i+1}|}{n-1}. \tag{1}$$

* I understand that similar results (unpublished) were independently obtained by E. C. Fieller &
H. O. Hartley.






A. R. Kamat 117 
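It is clear that the $\delta_i$ are normally distributed with zero mean and that

$$\sigma^2(\delta_i) = 2\sigma^2, \qquad \rho(\delta_i, \delta_{i+p}) = \begin{cases} -\tfrac{1}{2} & (p = 1), \\ 0 & (p > 1). \end{cases} \tag{2}$$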
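The variance and correlations quoted here follow from the independence of the $x_i$. A short worked derivation, added as a check and not part of the original argument, is:

```latex
\begin{aligned}
\operatorname{var}(\delta_i) &= \operatorname{var}(x_i) + \operatorname{var}(x_{i+1}) = 2\sigma^2, \\
\operatorname{cov}(\delta_i, \delta_{i+1}) &= \operatorname{cov}(x_i - x_{i+1},\; x_{i+1} - x_{i+2})
  = -\operatorname{var}(x_{i+1}) = -\sigma^2, \\
\rho(\delta_i, \delta_{i+1}) &= \frac{-\sigma^2}{2\sigma^2} = -\tfrac{1}{2}, \qquad
\rho(\delta_i, \delta_{i+p}) = 0 \quad (p > 1),
\end{aligned}
```

the last result holding because $\delta_i$ and $\delta_{i+p}$ share no observation when $p > 1$.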


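To find the first four moments of $D$ (and therefore of $d$), it is necessary to evaluate the
following expectations:

$$\begin{gathered}
\mathcal{E}(d_1), \\
\mathcal{E}(d_1^2),\ \mathcal{E}(d_1 d_2), \\
\mathcal{E}(d_1^3),\ \mathcal{E}(d_1^2 d_2),\ \mathcal{E}(d_1 d_2 d_3), \\
\mathcal{E}(d_1^4),\ \mathcal{E}(d_1^3 d_2),\ \mathcal{E}(d_1^2 d_2^2),\ \mathcal{E}(d_1^2 d_2 d_3),\ \mathcal{E}(d_1 d_2^2 d_3),\ \mathcal{E}(d_1 d_2 d_3 d_4).
\end{gathered} \tag{3}$$

These are either the ordinary moments or the absolute moments of the multivariate normal
distribution of $\delta_1, \delta_2, \delta_3, \delta_4$. All of them except the last, viz. $\mathcal{E}(d_1 d_2 d_3 d_4)$, can be evaluated from
the results given by Nabeya (1951, 1952) and Kamat (1953a) for the bivariate and the
trivariate absolute moments, and are as follows:

$$\begin{gathered}
\mathcal{E}(d_1) = \frac{2}{\sqrt{\pi}}\,\sigma, \\
\mathcal{E}(d_1^2) = 2\sigma^2, \qquad \mathcal{E}(d_1 d_2) = \left(\frac{2\sqrt{3}}{\pi} + \frac{1}{3}\right)\sigma^2, \\
\mathcal{E}(d_1^3) = \frac{8}{\sqrt{\pi}}\,\sigma^3, \qquad \mathcal{E}(d_1^2 d_2) = \frac{5}{\sqrt{\pi}}\,\sigma^3, \qquad
\mathcal{E}(d_1 d_2 d_3) = \frac{4}{\pi^{3/2}}\left(\sqrt{2} + \frac{5\pi}{4} - 3\cos^{-1}\frac{1}{\sqrt{3}}\right)\sigma^3, \\
\mathcal{E}(d_1^4) = 12\sigma^4, \qquad \mathcal{E}(d_1^3 d_2) = \left(2 + \frac{9\sqrt{3}}{\pi}\right)\sigma^4, \qquad \mathcal{E}(d_1^2 d_2^2) = 6\sigma^4, \\
\mathcal{E}(d_1^2 d_2 d_3) = \left(\frac{2}{3} + \frac{5\sqrt{3}}{\pi}\right)\sigma^4, \qquad \mathcal{E}(d_1 d_2^2 d_3) = \frac{12}{\pi}\,\sigma^4.
\end{gathered} \tag{4}$$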





4 


As regards $\mathcal{E}(d_1 d_2 d_3 d_4)$, two methods were used to find its value. The first is to use the
integral form given by Nabeya* (1952). The actual evaluation from his formula of this
absolute moment, for the four-variate case with a general correlation pattern, involves
considerable difficulty. Even in our particular case, where three of the six correlations are
zero and the remaining three are equal, the evaluation is quite elaborate. Using this method,

$$\mathcal{E}(d_1 d_2 d_3 d_4) = \frac{4\sigma^4}{\pi^4} \iiiint \frac{\partial^4}{\partial t_1\,\partial t_2\,\partial t_3\,\partial t_4}\left(\exp\left[-\tfrac{1}{2}\sum t_i^2 - \sum \rho_{ij} t_i t_j\right]\right) \frac{dt_1\,dt_2\,dt_3\,dt_4}{t_1 t_2 t_3 t_4}, \tag{5}$$

where $i, j = 1, 2, 3, 4$; $i \ne j$ and $\rho_{12} = \rho_{23} = \rho_{34} = -\tfrac{1}{2}$, $\rho_{13} = \rho_{24} = \rho_{14} = 0$. Carrying out the
differentiations in (5), the expression can be reduced to a sum of nineteen integrals of the type

$$\iiiint t_1^{l}\, t_2^{m}\, t_3^{n}\, t_4^{p} \exp\left[-\tfrac{1}{2}\sum t_i^2 - \sum \rho_{ij} t_i t_j\right] dt_1\,dt_2\,dt_3\,dt_4, \tag{6}$$


* We are indebted to Mr S. Nabeya for sending us the manuscript of his paper (1952) which contains 
this formula in advance of publication. 











118 The mean successive difference 


where $-1 \le l, m, n, p \le +2$ and $l + m + n + p = 0, -2, -4$. Eighteen of these integrals can
be evaluated analytically. For instance,


i] j | | t71#8t; exp [—}203—Lp,,t,t,]dt,dt,dtydt, 


= [) dp24ps, Ee Spal ffoet- 42b;t7 — Xp,;t;t;) dtydtydtyat |, De a 


where 6b, =6,;=by=1, Piz = Pos = Pug = 9- 


This is further simplified as 


~4 
o2 
ent | | dpredes ES (4-4 iss a where A = (b,—pi2) (1 —p%,) — Pis 
6 2°23 2=1, Pas=— 


= 67? {fa — Pia) (3 — Pa — Pia + P3aPi2)* WP r2d Pa, 
0 


" an*(Y (5+ §,/Stan-1 7, a3): (7) 
The integral (6) with / = m = n = p = —1, however, leads to 


r= | sin (/(= i) Ja zy’ 


which had to be evaluated by numerical quadrature, and its value was found to be 0.328988.
Substituting in (5) the values of all these nineteen integrals, it reduces to





6 (d, 4,434) = rt (5+ 4sin- a sin-!,/f—}sin™ 76+ 4| 
$= 16\sigma^4(1.366622)/\pi^2 = (2.215484)\,\sigma^4,$ (8)


correct to at least five places of decimals and, possibly, to six places. 
This result was checked by finding &(d,d,d,d,) using the expansion in series of powers 
of correlation coefficients given by the present author (1953a, pp. 28-30), viz. 


1604 . 
6(d,d,d3d,) = ey {1+ $2 pi; + LpisPiPjx + TeUPty — FEPi; Pix 
+ h2pijPiut ZpyPuPinPat---}, (9) 


where $i, j, k, l$ are all different and assume the values 1, 2, 3, 4. For instance, expanding this
series up to the tenth power of the correlation coefficients and substituting $\rho_{12} = \rho_{23} = \rho_{34} = -\tfrac{1}{2}$,
$\rho_{13} = \rho_{24} = \rho_{14} = 0$, we get


$\mathcal{E}(d_1 d_2 d_3 d_4) = 16\sigma^4 (1.366574)/\pi^2 = (2.215406)\,\sigma^4$, (10)


which agrees with the result above to the first four significant figures.
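The arithmetic linking the bracketed constants in (8) and (10) to the final coefficients is easy to confirm. The Python sketch below only re-evaluates $16c/\pi^2$ for the two constants quoted in the text; the function name `scaled_moment` is ours and purely illustrative.

```python
import math

def scaled_moment(c):
    """Return 16*c/pi**2, i.e. E(d1 d2 d3 d4)/sigma**4 for a bracketed constant c."""
    return 16.0 * c / math.pi ** 2

quadrature_value = scaled_moment(1.366622)  # constant quoted in (8)
series_value = scaled_moment(1.366574)      # constant quoted in (10)
print(f"{quadrature_value:.6f} {series_value:.6f}")
```

The two values differ only from the fifth decimal place onwards, which is the agreement the text describes.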





2.2. Moments of d


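Denoting $n - 1$ by $m$,

$$
\begin{aligned}
\mathcal{E}(D) ={}& m\,\mathcal{E}(d_1),\\
\mathcal{E}(D^2) ={}& m\,\mathcal{E}(d_1^2) + 2(m-1)\,\mathcal{E}(d_1 d_2) + (m-1)(m-2)\,\mathcal{E}^2(d_1),\\
\mathcal{E}(D^3) ={}& m\,\mathcal{E}(d_1^3) + 6(m-1)\,\mathcal{E}(d_1^2 d_2) + 3(m-1)(m-2)\,\mathcal{E}(d_1^2)\,\mathcal{E}(d_1) + 6(m-2)\,\mathcal{E}(d_1 d_2 d_3)\\
& + 6(m-2)(m-3)\,\mathcal{E}(d_1 d_2)\,\mathcal{E}(d_1) + (m-2)(m-3)(m-4)\,\mathcal{E}^3(d_1),\\
\mathcal{E}(D^4) ={}& m\,\mathcal{E}(d_1^4) + 8(m-1)\,\mathcal{E}(d_1^3 d_2) + 4(m-1)(m-2)\,\mathcal{E}(d_1^3)\,\mathcal{E}(d_1) + 6(m-1)\,\mathcal{E}(d_1^2 d_2^2)\\
& + 3(m-1)(m-2)\,\mathcal{E}^2(d_1^2) + 24(m-2)\,\mathcal{E}(d_1^2 d_2 d_3) + 12(m-2)\,\mathcal{E}(d_1 d_2^2 d_3)\\
& + 24(m-2)(m-3)\,\mathcal{E}(d_1^2 d_2)\,\mathcal{E}(d_1) + 12(m-2)(m-3)\,\mathcal{E}(d_1 d_2)\,\mathcal{E}(d_1^2)\\
& + 6(m-2)(m-3)(m-4)\,\mathcal{E}(d_1^2)\,\mathcal{E}^2(d_1) + 24(m-3)\,\mathcal{E}(d_1 d_2 d_3 d_4)\\
& + 24(m-3)(m-4)\,\mathcal{E}(d_1 d_2 d_3)\,\mathcal{E}(d_1) + 12(m-3)(m-4)\,\mathcal{E}^2(d_1 d_2)\\
& + 12(m-3)(m-4)(m-5)\,\mathcal{E}(d_1 d_2)\,\mathcal{E}^2(d_1)\\
& + (m-3)(m-4)(m-5)(m-6)\,\mathcal{E}^4(d_1).
\end{aligned}
\tag{11}
$$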
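The numerical coefficients in (11) are counts of ordered index tuples $(i_1, \ldots, i_k)$, classified by which indices coincide and which are adjacent (only adjacent $d_i$ are dependent). The following Python sketch checks a few of these counts by brute force. The helper names `shape` and `term_counts` are ours, and the classification assumes that a run of consecutive indices and its mirror image give the same expectation, as they do when the joint distribution of the $d_i$ is unchanged by reversing their order.

```python
from collections import Counter
from itertools import product

def shape(indices):
    """Classify an index tuple by its runs of consecutive indices.

    Each run is recorded as the tuple of multiplicities of its indices;
    a run and its mirror image are treated as the same shape.
    """
    counts = Counter(indices)
    runs, current, previous = [], [], None
    for i in sorted(counts):
        if previous is not None and i != previous + 1:
            runs.append(current)
            current = []
        current.append(counts[i])
        previous = i
    runs.append(current)
    return tuple(sorted(min(tuple(r), tuple(reversed(r))) for r in runs))

def term_counts(m, k):
    """Count the ordered k-tuples drawn from {0, ..., m-1} in each shape."""
    return Counter(shape(t) for t in product(range(m), repeat=k))

m = 8
c2, c3, c4 = term_counts(m, 2), term_counts(m, 3), term_counts(m, 4)
# Brute-force count next to the coefficient printed in (11).
print(c2[((1, 1),)], 2 * (m - 1))                    # E(d1 d2)
print(c3[((1, 1, 1),)], 6 * (m - 2))                 # E(d1 d2 d3)
print(c4[((1, 1, 1, 1),)], 24 * (m - 3))             # E(d1 d2 d3 d4)
print(c4[((1, 1), (1, 1))], 12 * (m - 3) * (m - 4))  # E^2(d1 d2)
```

For a shape such as `((1,), (1, 2))` the count corresponds to the product term $\mathcal{E}(d_1^2 d_2)\,\mathcal{E}(d_1)$; the polynomial coefficients agree with the brute-force counts only when $m$ is large enough for every factor to be non-negative.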







Substituting the values of the various terms, the results (11) lead us to the following central 
moments of d: 


: 














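$$
\begin{aligned}
\mu_1' &= \frac{2}{\sqrt{\pi}}\,\sigma = 1.1283790\,\sigma,\\
\mu_2 &= \left\{\left(\frac{8}{3} + \frac{4\sqrt{3} - 12}{\pi}\right) m^{-1} - \left(\frac{2}{3} + \frac{4\sqrt{3} - 8}{\pi}\right) m^{-2}\right\}\sigma^2\\
&= \{1.052264\,m^{-1} - 0.325504\,m^{-2}\}\,\sigma^2,\\
\mu_3 &= \left\{\left(16\pi + 160 + 24\sqrt{2} - 96\sqrt{3} - 72\cos^{-1}\frac{1}{\sqrt{3}}\right)\frac{1}{\pi\sqrt{\pi}}\, m^{-2}\right.\\
&\qquad \left. - \left(42\pi + 192 + 48\sqrt{2} - 144\sqrt{3} - 144\cos^{-1}\frac{1}{\sqrt{3}}\right)\frac{1}{\pi\sqrt{\pi}}\, m^{-3}\right\}\sigma^3\\
&= \{1.642670\,m^{-2} - 0.870598\,m^{-3}\}\,\sigma^3,
\end{aligned}
\tag{12}
$$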
8 4,/3—12\? 1 l 
= {3{- —| — 1 —1— _ 60,/2+21 - 
si Ma {35+ “ ) mt [19( 80 cos 3 (2+ 210,/3 309) 
167? 1 
eS a 8 2 
8(20./3+74)——3— + 24m | = 
wers 1 
+ [ 16(144 (2 + 468 — 432 cos“ B — 360 13) + 87(21,/3 + 256) 
1 
+47? — 72am | sail where a = &(d,d,d,d,)/o* = 2-215484 
$= \{3(1.052264)^2\,m^{-2} + 1.119816\,m^{-3} - 2.040188\,m^{-4}\}\,\sigma^4.$
2.3. Approximate distribution of d
It is clear from (12) that $\beta_1 \to 0$ and $\beta_2 \to 3$ as $m$ (and hence $n$) tends to infinity. Table 1
gives the values of $\sigma(d)$, $\beta_1$ and $\beta_2$ for $n = 3(1)10, 15, 20, 25, 30, 40, 50$. The $\beta_1$, $\beta_2$ points lie
in the Pearson Type I region, just above the Type III line. If we may assume that a Pearson
Type I curve with the correct first four moments will represent the distribution of d
adequately, it is relatively easy to obtain approximations to certain percentage points. The
upper and lower 5 and 0.5 % points for the appropriate $\beta_1$, $\beta_2$ values were found by
interpolating in Pearson & Merrington's (1951) tables of standardized percentage points. For
$n = 4(1)8$, the 2.5 and 1 % points were obtained by interpolation in Thompson's (1941a)
tables of percentage points of the Incomplete Beta-function, first finding the appropriate
$p = \tfrac{1}{2}\nu_2$ and $q = \tfrac{1}{2}\nu_1$ from the standard relations connecting the Type I parameters with the
$\beta_1$, $\beta_2$ constants of the distribution. Finally, for $n > 8$ it was found that a Type III curve
with the correct first three moments was indistinguishable for our purpose from the Type I,
and the 2.5 and 1 % points were therefore found with the help of the tables of percentage
points of the $\chi^2$-distribution (Thompson, 1941b).
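The entries of Table 1 can be regenerated from the numerical forms of the central moments in (12). The Python sketch below does this for $d/\sigma$; the function name is ours, and the restriction to $n \ge 4$ is an assumption on our part (the fourth-moment expression in (11) appears to need $m = n - 1 \ge 3$, so the $n = 3$ row is not reproduced this way).

```python
import math

def d_over_sigma_summary(n):
    """Return (S.D., beta1, beta2) of d/sigma from the numerical forms in (12).

    Intended for n >= 4 only (assumption: the expansion of E(D^4) in (11)
    requires m = n - 1 >= 3).
    """
    m = n - 1
    mu2 = 1.052264 / m - 0.325504 / m**2
    mu3 = 1.642670 / m**2 - 0.870598 / m**3
    mu4 = 3 * 1.052264**2 / m**2 + 1.119816 / m**3 - 2.040188 / m**4
    return math.sqrt(mu2), mu3**2 / mu2**3, mu4 / mu2**2

for n in (4, 10, 50):
    sd, beta1, beta2 = d_over_sigma_summary(n)
    print(n, round(sd, 4), round(beta1, 4), round(beta2, 4))
```

Both $\beta_1$ and $\beta_2$ move towards their normal values (0 and 3) as $n$ grows, as stated in the text.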





Table 1. Standard deviation, $\beta_1$ and $\beta_2$ values for $d/\sigma = \sum_{i=1}^{n-1} |x_i - x_{i+1}| / \{(n-1)\sigma\}$

 n     S.D.      beta_1    beta_2
 3     0.6669    1.0356    4.2962
 4     0.5609    0.7253    3.8940
 5     0.4927    0.5547    3.6857
 6     0.4443    0.4484    3.5548
 7     0.4078    0.3760    3.4655
 8     0.3791    0.3237    3.4008
 9     0.3556    0.2841    3.3518
10     0.3360    0.2531    3.3135
15     0.2711    0.1638    3.2028
20     0.2334    0.1210    3.1499
25     0.2080    0.0959    3.1188
30     0.1895    0.0795    3.0984
40     0.1636    0.0592    3.0733
50     0.1461    0.0471    3.0584

Note. Mean $d/\sigma = 1.128379$. The fourth significant figure in $\beta_1$ and the fifth significant figure in $\beta_2$
are not reliable.


Table 2. Percentage points for the approximate distribution of $d/\sigma$

            Lower % points                 Upper % points
 n      0.5     1      2.5     5        5      2.5     1      0.5
 3*     0.09   0.12   0.19   0.27     2.41   2.75   3.16   3.43
 4      0.17   0.22   0.29   0.37     2.18   2.44   2.76   2.99
 5      0.23   0.28   0.36   0.44     2.03   2.26   2.53   2.72
 6      0.28   0.33   0.41   0.49     1.94   2.14   2.37   2.53
 7      0.33   0.37   0.45   0.54     1.86   2.04   2.26   2.40
 8      0.37   0.41   0.49   0.57     1.81   1.97   2.17   2.30
 9      0.40   0.44   0.53   0.60     1.76   1.91   2.09   2.21
10      0.43   0.47   0.55   0.63     1.73   1.86   2.03   2.14
15      0.54   0.58   0.65   0.72     1.60   1.71   1.84   1.93
20      0.61   0.65   0.71   0.77     1.53   1.62   1.73   1.80
25      0.66   0.69   0.75   0.81     1.49   1.57   1.66   1.72
30      0.69   0.73   0.78   0.83     1.45   1.52   1.61   1.66
40      0.75   0.78   0.83   0.87     1.41   1.47   1.54   1.59
50      0.78   0.81   0.86   0.90     1.38   1.43   1.49   1.53

* For n = 3 all percentage points are exact.
























For n = 3 all percentage points were calculated from the exact distribution given in
§ 2.4 below. All these results are combined in Table 2. Certain tests of the accuracy of
the approximations are given in the following sections.

2.4. Comparison in the special case n = 3


No direct comparison between exact results and the Pearson Type approximation is 
possible except in the rather exceptional case n = 3. Here, by linear transformation, the 
probability integral can be expressed in terms of the bivariate normal integral and can be 
shown to be (see Kamat, 1953a, p. 31). 


P{dlo < dg} = 1+ 2{p_4(/2 do, 0) — py(f2 do, 0) + p_ yal (§) do, 0) — DyalV(F) do, 9}, (13) 
where $p_\rho(h, k) = (2\pi)^{-1}(1-\rho^2)^{-1/2} \int_h^\infty \int_k^\infty \exp\left[-\dfrac{x^2 + y^2 - 2\rho x y}{2(1-\rho^2)}\right] dx\,dy.$


This function is tabulated in Karl Pearson's (1931) Tables for Statisticians and Biometricians,

Part II. The following comparison for the upper and lower percentage points for 
D=(n—1)d = 2d 

shows that the Pearson Type I fit is adequate for practical purposes*. 











                     Lower % points           Upper % points
                   1      2.5     5         5      2.5     1
Integral (13)     0.24   0.38   0.53      4.82   5.49   6.31
Pearson fit       0.26   0.40   0.55      4.81   5.47   6.29





























2.5. Sampling experiment
An extensive sampling experiment was carried out with the help of Hollerith punched-card
equipment. 25,000 Hollerith cards, each bearing a random normal deviate from
Wold's (1948) tables of random normal deviates, were available. These cards were sorted
into 2500 random samples of 10, $X_{j,i}$ ($i = 1, 2, \ldots, 10$; $j = 1, 2, \ldots, 2500$). On a Hollerith
Senior Rolling Tabulator the following tabulation was then carried out. From each of the
2500 samples of 10 the ten progressive sums

$$|X_{j,1} - X_{j,2}|,\quad |X_{j,1} - X_{j,2}| + |X_{j,2} - X_{j,3}|,\quad \ldots,\quad \sum_{i=1}^{10} |X_{j,i} - X_{j,i+1}|$$

were formed and printed (taking $X_{j+1,1}$, the first member of the next sample, as $X_{j,11}$).
For any sample size $n \le 10$, i.e. $n - 1 \le 9$, the particular sum $\sum_{i=1}^{n-1} |X_{j,i} - X_{j,i+1}|$ provides
an independent sample value of D, so that for each of the sample sizes 2, 3, ..., 10 (i.e.
n - 1 = 1, 2, ..., 9), 2500 independent values of the successive difference sum, D, were
available. The same tabulation can also provide 1250 independent sampling values of D for


* Calculations made since this table was sent to Press show equally satisfactory agreement at the 
lower and upper 0-5 % points. 













sample size n with $10 < n \le 20$, by adding the contributions from the consecutive samples
(j and j + 1) as follows:

$$D = \sum_{i=1}^{10} |X_{j,i} - X_{j,i+1}| + \sum_{i=1}^{n-10} |X_{j+1,i} - X_{j+1,i+1}|.$$

The resulting distributions for n = 3, 4, 6, 8 and 10 were then compared with the
approximations based on the Pearson curves derived in the preceding section. The distributions
of D were obtained by grouping the individual values of D listed on the Hollerith tabulator
into frequency groups of suitable breadth (0.10 for n = 3, 0.20 for n = 4, and 0.50 for
n = 6, 8, 10). Thus for n = 4, 19 of the 2500 values of D were found to be less than 0.61 and
49 less than 0.81. Inverse interpolation gives an empirical estimate of 0.639 for the lower
1 % point for D, i.e. of 0.21 for the 1 % point for $d = \tfrac{1}{3}D$. Table 3 compares the empirical
percentage points derived in this way from the sampling investigation with the Pearson-curve
values taken from Table 2. On this form of comparison, it is not possible to apply
a test of goodness of fit, but it appears that the agreement is satisfactory.
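A sampling experiment of this kind is now quick to repeat with a pseudo-random generator. The Python sketch below is only loosely modelled on the procedure described above: it draws independent normal samples rather than chaining punched-card samples, uses a simple rank lookup instead of inverse interpolation in grouped frequencies, and all names (`simulate_d`, `percentage_point`) and the choice of 20,000 samples are ours.

```python
import random

def simulate_d(n, samples, seed=0):
    """Return sorted simulated values of d (with sigma = 1) for samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total = sum(abs(x[i] - x[i + 1]) for i in range(n - 1))
        values.append(total / (n - 1))
    values.sort()
    return values

def percentage_point(sorted_values, percent):
    """Empirical percentage point by rank lookup (no interpolation)."""
    k = int(round(percent / 100.0 * len(sorted_values)))
    return sorted_values[min(max(k, 0), len(sorted_values) - 1)]

d_values = simulate_d(n=10, samples=20000)
mean_d = sum(d_values) / len(d_values)
lower5 = percentage_point(d_values, 5)
upper5 = percentage_point(d_values, 95)
print(round(mean_d, 3), round(lower5, 2), round(upper5, 2))
```

For n = 10 the simulated mean should lie close to 1.128, and the 5 % points close to the Table 2 values 0.63 and 1.73, up to sampling error.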


Table 3. Comparison of Pearson Type I approximation with empirical sampling distributions

                      Lower % points                 Upper % points
 n                0.5     1      2.5     5        5      2.5     1      0.5
 3   Theory      0.09   0.12   0.19   0.27     2.41   2.75   3.16   3.43
     Sampling    0.08   0.11   0.17   0.26     2.39   2.66   2.97   3.47

 4   Theory      0.17   0.22   0.29   0.37     2.18   2.44   2.76   2.99
     Sampling    0.17   0.21   0.28   0.36     2.14   2.40   2.71   2.95

 6   Theory      0.28   0.33   0.41   0.49     1.94   2.14   2.37   2.53
     Sampling    0.27   0.32   0.41   0.49     1.90   2.14   2.35   2.49

 8   Theory      0.37   0.41   0.49   0.57     1.81   1.97   2.17   2.30
     Sampling    0.37   0.42   0.50   0.58     1.82   1.97   2.16   2.29

10   Theory      0.43   0.47   0.55   0.63     1.73   1.86   2.03   2.14
     Sampling    0.41   0.47   0.55   0.62     1.71   1.86   1.99   2.17

Note. For each n the upper figures are the theoretical values; the lower figures are those obtained
from the sampling distribution.
Two further comparisons were made: 

(i) As a check on sampling, the expected frequencies for D = 2d were obtained for the
exact distribution in the case n = 3 by using the integral (13) for three intervals at either
tail of the distribution. The following table sets out the comparison:

Interval*    0.00-0.20   0.21-0.40   0.41-0.60   0.61-5.00   5.01-5.40   5.41-5.80   5.81-∞
Observed        19          64          77         2246         34          24          36
Expected        17.6        54.1        86.7       2236.9       35.9        23.7        45.1

* The unit of Wold's random normal deviates, and therefore for values of D, is 0.01.

$\chi^2$ for six degrees of freedom is 4.99 and is not significant. The agreement is good and
therefore gives us confidence in using sampling results for n > 3 to check the adequacy of
Pearson curves for evaluating approximate probability points.















(ii) A detailed comparison was also made for the sample size n = 6 between the Pearson 
Type I fit and the empirical distribution for D obtained from the sampling experiment. 
This comparison is given below: 

















Interval for D     Expected   Observed
 0.01- 1.00           2.9         1
 1.01- 2.00          51.1        55
 2.01- 3.00         196.8       188
 3.01- 4.00         365.0       377
 4.01- 5.00         456.8       440
 5.01- 6.00         444.4       423
 6.01- 7.00         362.1       363
 7.01- 8.00         257.9       284
 8.01- 9.00         165.3       184
 9.01-10.00          96.8        92
10.01-11.00          52.3        44
11.01-12.00          26.6        29
12.01-13.00          12.9        12
13.01-14.00           5.4         3
14.01-15.00           2.5         1
15.01-16.00           0.8         1
16.01-                0.4         3

Total              2500.0      2500


























Grouping the frequencies, as indicated at the tails, $\chi^2$ for 12 degrees of freedom is 9.23,
which is not significant. The agreement is satisfactory.
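Both goodness-of-fit statistics quoted in comparisons (i) and (ii) can be recomputed from the tabulated frequencies. The Python sketch below does so under two assumptions of ours: the second expected frequency in (i) is read as 54.1 (which makes the expected column total 2500), and for (ii) the tail grouping is taken to pool the first two and the last four classes, giving the 13 classes implied by 12 degrees of freedom. The function names are illustrative.

```python
def chi_square(observed, expected):
    """Pearson's chi-square statistic for matched observed/expected frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Comparison (i): n = 3, seven intervals as tabulated.
obs_i = [19, 64, 77, 2246, 34, 24, 36]
exp_i = [17.6, 54.1, 86.7, 2236.9, 35.9, 23.7, 45.1]
chi2_i = chi_square(obs_i, exp_i)

# Comparison (ii): n = 6, seventeen intervals as tabulated.
obs_ii = [1, 55, 188, 377, 440, 423, 363, 284, 184, 92, 44, 29, 12, 3, 1, 1, 3]
exp_ii = [2.9, 51.1, 196.8, 365.0, 456.8, 444.4, 362.1, 257.9, 165.3,
          96.8, 52.3, 26.6, 12.9, 5.4, 2.5, 0.8, 0.4]

def pool(values):
    """Assumed grouping: pool the first two classes and the last four classes."""
    return [sum(values[:2])] + list(values[2:-4]) + [sum(values[-4:])]

chi2_ii = chi_square(pool(obs_ii), pool(exp_ii))
print(round(chi2_i, 2), round(chi2_ii, 2))
```

Under these assumptions the first statistic reproduces the quoted 4.99 and the second comes within about 0.01 of the quoted 9.23, a difference attributable to rounding in the tabulated expected frequencies.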


2.6. Bias with a slow-moving shift in the mean

As mentioned in the introduction, $d = \sum_{i=1}^{n-1} |x_i - x_{i+1}|/(n-1)$ and $\delta^2 = \sum_{i=1}^{n-1} (x_i - x_{i+1})^2/(n-1)$ are likely
to provide better estimates of $\sigma$ than $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2/(n-1)$, when the mean of the population is
undergoing a slow-moving shift. To illustrate this point let $\theta_i$ be the mean of the population
when the observation $x_i$ is taken and let $\Delta\theta_i/\sigma = (\theta_{i+1} - \theta_i)/\sigma$ be small, so that to the
required accuracy its powers beyond the second may be neglected. Then with the help of
the formulae we have developed elsewhere (1953), it can be shown that











_ Qo =(A0,)? 
(1) 6) = Flea (14) 
fs 1 (4, 23-6 1 (1 2/3—4 
mete oP tego SS 
1 


[Fang (A092 59 (00) + (00,2) 





TT? 
— ga 2(A8,) (A0..2)|}, (15) 
(2) &(8) = 204/1 + sana (18) 
var (8%) = 4o®| 9 ast qa ayigt (AM) (48) (A04))} (17) 
(3) = &(s*) = at + Oo =). (18) 
var (8*) = al ate =)" (19) 


where $\Delta\theta_i = \theta_{i+1} - \theta_i$ and $\bar\theta = \sum\theta_i/n$.
















It may be mentioned here that formulae (16)-(19) are exact and that (18) and (19) are
obtainable from the first two moments of the non-central $\chi^2$.

Now, for a 'slow-moving' shift in the mean, $\sum(\Delta\theta_i)^2/\sigma^2$ will be considerably less than
$\sum(\theta_i - \bar\theta)^2/\sigma^2$, and therefore the formulae show that the bias in estimating $\sigma$ is considerably
greater using $s^2$ than using $\delta^2$ or d. Again, the increase in the variance of the estimate is
also larger in the case of $s^2$. Two examples of how a slow-moving trend affects the bias in
the case of the three statistics are given below.


Example (1). $\theta_i = \mu + (0.05)\,i\,\sigma$ ($i = 1, 2, \ldots, n = 15$).

The percentage bias in the estimate d = 0.06.
The percentage bias in the estimate $\delta^2$ = 0.13.
The percentage bias in the estimate $s^2$ = 5.00.

The bias introduced in estimating $\sigma$ by $s^2$ is 40 times larger than that introduced by either
d or $\delta^2$.* While it is negligible in the latter it may not be so in the former.
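The three percentage biases of Example (1) can be recovered from the leading terms of the expectations. The Python sketch below assumes the following forms, which are our reading of the expectation formulae in this section and keep only squares of the shifts: $\mathcal{E}(d) \approx (2\sigma/\sqrt{\pi})\{1 + \sum(\Delta\theta_i)^2/(4(n-1)\sigma^2)\}$, $\mathcal{E}(\delta^2) = 2\sigma^2\{1 + \sum(\Delta\theta_i)^2/(2(n-1)\sigma^2)\}$ and $\mathcal{E}(s^2) = \sigma^2\{1 + \sum(\theta_i - \bar\theta)^2/((n-1)\sigma^2)\}$. The function name is hypothetical.

```python
def percentage_biases(theta):
    """Percentage biases of d, delta^2 and s^2 for population means theta.

    theta is given in units of sigma.  The three expressions are the assumed
    leading-order forms stated in the accompanying text.
    """
    n = len(theta)
    shifts = [theta[i + 1] - theta[i] for i in range(n - 1)]
    sum_sq_shift = sum(s * s for s in shifts)
    mean = sum(theta) / n
    sum_sq_dev = sum((t - mean) ** 2 for t in theta)
    bias_d = 100 * sum_sq_shift / (4 * (n - 1))
    bias_delta2 = 100 * sum_sq_shift / (2 * (n - 1))
    bias_s2 = 100 * sum_sq_dev / (n - 1)
    return bias_d, bias_delta2, bias_s2

theta = [0.05 * i for i in range(1, 16)]  # Example (1): linear trend, n = 15
biases = percentage_biases(theta)
print([round(b, 4) for b in biases])
```

The results, about 0.0625, 0.125 and 5.0 per cent, agree with the quoted 0.06, 0.13 and 5.00, and the ratio of the last to the second is the factor of 40 mentioned in the text.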
Example (2). $\theta_i = \mu + (0.04\,i + 0.004\,i^2)\,\sigma$ ($i = 1, 2, \ldots, n = 10$).

The percentage bias in the estimate d = 0.24.
The percentage bias in the estimate $\delta^2$ = 0.48.
The percentage bias in the estimate $s^2$ = 6.54.

Even in this quadratic trend, the bias is 14 times larger using $s^2$.


3. THE DISTRIBUTION OF THE RATIO OF THE MEAN SUCCESSIVE 
DIFFERENCE TO THE ROOT MEAN SQUARE 
3.1. Moments

Von Neumann (1941) has discussed the exact distribution of the ratio of the mean-square
successive difference to the variance, viz. $\delta^2/s^2$, where

$$\delta^2 = \sum_{i=1}^{n-1} \frac{(x_{i+1} - x_i)^2}{n-1} \quad\text{and†}\quad s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n}.$$

This ratio has been used to test the independence of successive observations and thus to
detect serial correlation. It is possible to use for a similar purpose the ratio


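$$W = \frac{d}{s} = \frac{\sum_{i=1}^{n-1} |x_i - x_{i+1}|/(n-1)}{\left\{\sum_{i=1}^{n} (x_i - \bar{x})^2/n\right\}^{1/2}}. \tag{20}$$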


By proving that W and s are independent it can be shown that W has a property of moments
which it possesses in common with a number of similar ratios, viz.

$$\mu_k'(W) = \frac{\mu_k'(d)}{\mu_k'(s)}, \tag{21}$$


* It is to be noted that d and $\delta^2$ introduce bias of the same magnitude, since while d estimates $\sigma$,
$\delta^2$ estimates $\sigma^2$.
† The divisor used by von Neumann is n, not n - 1.












where $\mu_k'$ is the kth moment about zero. [See in this connexion Geary (1936) and von Neumann
(1941).] Substituting in (21) the moments about zero of d and s, the first four moments
of W about zero are obtained as


a =r(5) : 1-128379° (5) | ; 
Tes TET 


a As Me a 


Me n\n 3 1 3 nm ) mf 





=™tt {1-273240 + 1-052264m-1 — 0-325504m-%, 





, 8 . 1 
ae 5) ) FEET br ar tts 
BB OR Tatas 
m+1 2 


+ (120+2084-24,/2— 120,/3— 72.008") : 


alt 


1 1 
+ (144/34 144 cos-!—. — 192-48 ,/2— 42n) inal 


3 


r(§) L (22) 
{1-43670 + 3-56206m-! + 0-54079m-2 — 0-87067m-3}, 
ent 








me m+1)* (16 
4= - let (96./3— 288 + 64m) 


l 
+ (S#22+-64,/30— 80m + 192,/2 + 2048 — 1152/3 — 576 cos- a) ata 


1 
+ (24an2 — 487? — 160 ,/37 — 9287 + 4032 cos“! —— 


(3 
+ 4512,/3 — 6480 — 1344 2) as 5 
+ (sn —72an® + 168 37 + 20487 + 2304 /2 
+7488 — 5760 ./3 — 6912 cost — +3) z : a 


ee * at 62114 + 8-03870m- + 8-24931m-2— 2-80941m-3 + 2-04141m-4, 


where $m = n - 1$ and $a = \mathcal{E}(d_1 d_2 d_3 d_4) = 2.215484$.





3.2. Approximate distribution of W = d/s
From the moments about zero given above, $\mu_1'$, $\sigma(W)$, $\beta_1$ and $\beta_2$ were calculated for
n = 5, 10, 15, 20, 25, 30, 40, 50. They are given in Table 4. (It is to be noted that $\mu_3 < 0$,
i.e. there is negative skewness.) It is seen from this table that the $\beta_1$, $\beta_2$ points lie in the
Pearson Type I region, very close to the symmetrical Type II line. For the criterion
$C = 1 - (n-1)\delta^2/(2ns^2)$, which is linearly related to $\delta^2/s^2$, the Pearson Type II curve
fitted extremely well, as can be seen from Young (1941) and Hart (1942). The present
author has verified that the 5 % points for $\delta^2/s^2$ itself based on the Type II approximation
coincide with those given by Hart (1942) based on the exact distribution within two (and
sometimes three) places of decimals for $n \ge 10$. It seems probable therefore that a Pearson
type curve will give a good fit to the distribution of W = d/s also, for $n \ge 10$. Table 5 gives


Table 4. The mean, standard deviation, $\beta_1$, $\beta_2$ values for W = d/s*

 n     mu_1'     sigma(W)   beta_1   beta_2
 5     1.3421    0.3061     0.009    2.55
10     1.2229    0.2116     0.013    2.84
15     1.1890    0.1709     0.011    2.91
20     1.1730    0.1470     0.009    2.94
25     1.1637    0.1311     0.007    2.95
30     1.1576    0.1194     0.006    2.96
40     1.1501    0.1030     0.005    2.96
50     1.1457    0.0919     0.004    2.98

Note. $\mu_3 < 0$.


Table 5. Percentage points for W = d/s*

            Lower % points                 Upper % points
 n      0.5     1      2.5     5        5      2.5     1      0.5
10      0.67   0.71   0.79   0.87     1.56   1.63   1.70   1.73
15      0.74   0.78   0.85   0.90     1.47   1.52   1.57   1.60
20      0.79   0.83   0.88   0.93     1.41   1.46   1.51   1.54
25      0.82   0.86   0.91   0.95     1.38   1.42   1.47   1.49
30      0.85   0.88   0.92   0.96     1.35   1.39   1.44   1.46
40      0.88   0.91   0.95   0.98     1.32   1.35   1.39   1.41
50      0.91   0.93   0.97   0.99     1.30   1.33   1.36   1.38

* It should be noted that following the definition used by von Neumann we have taken $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2/n$,
i.e. taken a divisor of n, not n - 1.


the upper and lower 0.5, 1, 2.5 and 5 % points for W based on the Pearson type approximation.
The 5 and 0.5 % points were calculated from the Pearson-Merrington (1951) tables.
The 2.5 and 1 % points for n = 10, 15 and 20 were calculated with the help of Thompson's
(1941a) tables of the percentage points for the Incomplete Beta-function, while for the
remaining sample sizes ($n \ge 25$) the normal approximation, which was sufficient for the
purpose, was used. It may be remarked that for positive serial correlation, only the lower
significance points are necessary. It is hoped to give some further illustrations of the uses
of the distribution of $d/\sigma$ and d/s in a later paper.
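The mean and standard deviation of W in Table 4 follow from the moment property (21) once the first two moments of d and s are known. The Python sketch below is a minimal check under stated assumptions: the moments of s are the standard normal-theory ones for $s^2 = \sum(x_i - \bar{x})^2/n$ (so that $ns^2/\sigma^2$ is $\chi^2$ with $n - 1$ degrees of freedom), the second moment of d uses the numerical form of $\mu_2$ in (12), and the function name is ours.

```python
import math

def w_mean_and_sd(n):
    """Return (mean, S.D.) of W = d/s via mu_k'(W) = mu_k'(d) / mu_k'(s)."""
    m = n - 1
    # E(d)/sigma and E(d^2)/sigma^2 from the moments of d.
    d1 = 2.0 / math.sqrt(math.pi)
    d2 = d1**2 + 1.052264 / m - 0.325504 / m**2
    # E(s)/sigma and E(s^2)/sigma^2 for a normal sample, divisor n (assumed).
    s1 = math.sqrt(2.0 / n) * math.exp(math.lgamma(n / 2.0) - math.lgamma(m / 2.0))
    s2 = m / n
    mean = d1 / s1
    variance = d2 / s2 - mean**2
    return mean, math.sqrt(variance)

for n in (5, 10, 50):
    mean_w, sd_w = w_mean_and_sd(n)
    print(n, round(mean_w, 4), round(sd_w, 4))
```

Working with `math.lgamma` rather than `math.gamma` keeps the gamma-function ratio stable for large n.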















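3.3. Comparison of the distributions of $d/\sigma$ and d/s
It is of interest to compare the distribution of the ratio criterion d/s with that of the
numerator $d/\sigma$. This comparison does not reveal the familiar properties of a studentized
statistic, as the denominator s, which has been computed from the same sample as the
numerator, is not independent of d. In fact, d and s are positively correlated since

$$\mathcal{E}(d \cdot s) = \mathcal{E}\left(\frac{d}{s}\, s^2\right) = \mathcal{E}\left(\frac{d}{s}\right)\mathcal{E}(s^2) = \frac{\mathcal{E}(d)}{\mathcal{E}(s)}\,\mathcal{E}(s^2),$$

so that

$$\operatorname{cov}(d, s) = \frac{\mathcal{E}(d)}{\mathcal{E}(s)}\,\{\mathcal{E}(s^2) - \mathcal{E}^2(s)\} = \frac{\mathcal{E}(d)}{\mathcal{E}(s)}\operatorname{var}(s) > 0.$$

In consequence of this positive correlation the ratio d/s has a smaller variance than $d/\sigma$,
as can be seen from Tables 1 and 4, and, since $\mathcal{E}(d/s) > \mathcal{E}(d/\sigma)$, the same holds for their
'relative variances', i.e.

$$\operatorname{var}(d/s)/\mathcal{E}^2(d/s) < \operatorname{var}(d/\sigma)/\mathcal{E}^2(d/\sigma).$$

As $n \to \infty$, $\mathcal{E}(d/s) \to \mathcal{E}(d/\sigma) = 2/\sqrt{\pi}$, but

$$\frac{\operatorname{var}(d/s)}{\operatorname{var}(d/\sigma)} \to \frac{4\pi + 6\sqrt{3} - 21}{4\pi + 6\sqrt{3} - 18} = 0.395,$$

which is less than 1. It may be added that similar properties hold for the statistics $\delta^2/\sigma^2$
and $\delta^2/s^2$ mentioned in § 3.1.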
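The limiting variance ratio is a one-line computation. The short Python check below evaluates it and also confirms that the same constant $4\pi + 6\sqrt{3} - 18$ reproduces the leading coefficient 1.052264 of the variance of $d/\sigma$, taking that coefficient to be $2(4\pi + 6\sqrt{3} - 18)/(3\pi)$ (our rearrangement, not a formula quoted from the paper).

```python
import math

base = 4 * math.pi + 6 * math.sqrt(3)
limit_ratio = (base - 21) / (base - 18)          # limiting var(d/s) / var(d/sigma)
leading_coefficient = 2 * (base - 18) / (3 * math.pi)
print(round(limit_ratio, 3), round(leading_coefficient, 6))
```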

As will be seen from Tables 2 and 5, a pair of corresponding upper and lower percentage
points of $d/\sigma$ bracket the corresponding percentage points of d/s.


Finally, I wish to express my sincere thanks to Prof. E. S. Pearson and Dr H. O. Hartley
for their help and guidance in the course of these investigations.


REFERENCES 


Arley, N. & Hald, A. (1950). Math. Tidsskr. B, p. 86.
Geary, R. C. (1936). Biometrika, 28, 295.
Guest, P. G. (1951). Suppl. J. Roy. Statist. Soc. 13, 233.
Hart, B. I. (1942). Ann. Math. Statist. 12, 445.
Kamat, A. R. (1953a). Biometrika, 40,
Kamat, A. R. (1953b). (Unpublished results.)
Keen, J. & Page, D. J. (1953). Applied Statistics, 2, 13.
Nabeya, S. (1951). Ann. Inst. Statist. Math. 3, 2.
Nabeya, S. (1952). Ann. Inst. Statist. Math. 4, 15.
Pearson, E. S. & Merrington, M. (1951). Biometrika, 38, 4.
Pearson, K. (1931). Tables for Statisticians and Biometricians, 2. Biometrika Trust, London.
Thompson, C. M. (1941a). Biometrika, 32, 151.
Thompson, C. M. (1941b). Biometrika, 32, 187.
von Neumann, J. (1941). Ann. Math. Statist. 12, 367.
von Neumann, J., Kent, R. H., Bellinson, H. R. & Hart, B. I. (1941). Ann. Math. Statist. 12, 153.
Young, L. C. (1941). Ann. Math. Statist. 12, 293.
Wold, H. (1948). Tracts for Computers, No. 25. Cambridge University Press.










THE EFFECT OF UNEQUAL GROUP VARIANCES ON THE F-TEST 
FOR THE HOMOGENEITY OF GROUP MEANS 


By G. HORSNELL 


1. INTRODUCTION 


David & Johnson (1951) have studied briefly, inter alia, the effect of unequal group variances 
on the nominal significance levels of the F-test (variance ratio) when used to investigate 
differences among a number of group means. The present note takes those calculations 
a step further and also considers the effect on the power of the test in a special case. 

If a total of N observations of a random variable are divided into $l$ groups, with frequencies
$n_t$ ($t = 1, 2, \ldots, l$), where $N = \sum_t n_t$, if $S_1$ and $S_2$ are the usual between group and within group
sums of squares and if $F_\alpha$ is the $100\alpha$ % significance level of the F-distribution with $\nu_1 = l - 1$
and $\nu_2 = N - l$ degrees of freedom, then the David-Johnson method consists in approximating
to the distribution of

$$x = S_1 - \frac{l-1}{N-l}\, F_\alpha S_2. \tag{1}$$


Three types of curve have been used in deriving the approximations: 

(i) the Edgeworth series in the form 

$$p(X) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}X^2} \left\{1 + \frac{\kappa_3(X)}{3!}\, H_3(X) + \frac{\kappa_4(X)}{4!}\, H_4(X) + \frac{10\kappa_3^2(X)}{6!}\, H_6(X)\right\}, \tag{2}$$

where $X = (x - \kappa_1(x))/\kappa_2^{1/2}(x)$, $\kappa_r(x)$ is the rth cumulant of x and the H-functions are Hermite
polynomials;

(ii) Johnson's (1949) unbounded curve $S_U$, and the log-normal curve;

(iii) a Pearson curve, usually of type IV.

The curves were fitted using the first four moments of x derived from the moments of
$S_1$ and $S_2$ given by David & Johnson. Apart from unequal group variances, $\kappa_{2t}$, the usual
assumptions made in the analysis of variance are supposed satisfied.

The present investigation can only be regarded as exploratory in the sense that it uses
the general formulae to illustrate certain special cases. The number of groups has been
confined to $l = 4$, and the six different sets of values for the group variances $\kappa_{2t}$ ($t = 1, 2, 3, 4$)
considered by David & Johnson have been used. These values are as follows:





Set Ka Ko Ke3 Ko 





ee tt et et 
—— i DD 


ZevpQnp 
wre tto~ 
oo 09 69 bo to bo 




























2. EFFECT ON SIGNIFICANCE LEVEL WHEN THE GROUP FREQUENCIES ARE EQUAL 


Four cases with an equal number of observations, $n_t$, in each group, namely, 5, 10, 15 and
20 respectively, will be considered first. The values obtained for the actual tail areas cut
off when using the nominal 5 % and 1 % significance levels of F are given in Table 1.
A column for the case of equal variances is also included to give an indication of the accuracy
of the method. The values obtained from approximating to the distribution of x by the three
types of curve are in fairly close agreement, the Pearson curve, where used, nearly always
giving a result closer to that of the $S_U$ than to that of the Edgeworth series. There does not
appear to be any change in the error of the significance level (which in every case is small) as
the sample size increases.
Table 1. Equal groups

(a) 5 % nominal significance level

                      Equal                    Unequal variances
n_t   Curve fitted    variances   Set A   Set B   Set C   Set D   Set E   Set F
 5    Edgeworth       0.051       0.056   0.056   0.054   0.063   0.060   0.062
      S_U             0.052       0.058   0.058   0.055   0.069   0.064   0.066
      Type IV         -           -       -       -       0.068   -       -
10    Edgeworth       0.053       0.059   0.058   0.056   0.064   0.062   0.063
      S_U             0.050       0.055   0.055   0.052   0.063   0.059   0.061
      Type IV         0.050       -       -       -       0.063   -       -
15    Edgeworth       0.053       0.058   0.057   0.056   0.062   0.061   0.061
      S_U             0.049       0.054   0.054   0.052   0.062   0.058   0.059
      Type IV         -           -       -       -       0.062   -       -
20    Edgeworth       0.053       0.057   0.057   0.055   0.060   0.059   0.060
      S_U             0.049       0.054   0.054   0.052   0.061   0.058   0.059
      Type IV         0.049*      -       -       -       0.059   -       -

* Value obtained from Pearson type V curve as the $\beta_1$, $\beta_2$ point for x lay close to the type V line.

(b) 1 % nominal significance level

                      Equal                    Unequal variances
n_t   Curve fitted    variances   Set A   Set B   Set C   Set D   Set E   Set F
 5    S_U             0.010       0.013   0.012   0.011   0.018   0.015   0.016
10    S_U             0.010       0.013   0.013   0.011   0.018   0.015   0.016
15    S_U             0.010       0.012   0.012   0.011   0.017   0.014   0.015
20    S_U             0.009       0.012   0.012   0.011   0.016   0.014   0.015



















































3. EFFECT ON SIGNIFICANCE LEVEL WHEN THE GROUP FREQUENCIES ARE UNEQUAL 


Welch (1937) considered the effect of unequal variances on the t-test for the difference of
two population means. He found that for $n_1 = n_2 = 10$ and a nominal significance level of
5 %, the true level lay approximately between 5 and 6.5 % for all values of $\kappa_{21}/\kappa_{22}$. For
$n_1 = 5$ and $n_2 = 15$, however, he found that the test could be seriously misleading, and
showed that the significance of the difference between the two means would tend to be
underestimated when $\kappa_{21} < \kappa_{22}$ and overestimated when $\kappa_{21} > \kappa_{22}$ for a given nominal level
of significance. This problem has also been considered by Gronow (1951, 1953).

Table 2 shows that these conclusions can be generalized to the case of more than two 
groups. The sets of variances considered are set A and set D. The error in the significance 
level appears to depend only on the ratios of the numbers in the groups, $n_t/N$, and not on
their absolute magnitudes. For the error in the significance level to be a minimum, it is 
necessary to take slightly more observations in the group with the high variance. An 
empirical rule seems to be that the ratios should lie about half-way between equality and 
the ratios of the standard deviations in the groups, for the cases considered here. A similarity 
may be noted between this condition and that in stratified sampling, where, to obtain a 
linear estimator of a population parameter with minimum variance, it is necessary to take 
a sample from each stratum proportional in size to the standard error and the size of the 
stratum. 


4. EFFECT OF UNEQUAL VARIANCES ON THE POWER OF THE TEST


The David-Johnson method will now be used to examine how inequality of variance affects
the power of the test, that is, the probability that it will detect real differences between the
population group means. Again, it is only possible to examine particular cases. For the
standard case of 'normal theory', where $\kappa_{2t} = \sigma^2$ for all t, the power of the test depends on
(a) $\nu_1 = l - 1$ and $\nu_2 = N - l$, the between- and within-group degrees of freedom, and (b) on
a single parameter which has been termed the non-central parameter, measuring the degree
of heterogeneity among the group means. If the population group means are $\kappa_{1t}$ ($t = 1, 2, \ldots, l$),
if $\bar\kappa_1 = \sum_t n_t \kappa_{1t} / \sum_t n_t$ and if we write $C_t = \kappa_{1t} - \bar\kappa_1$,
then Tang (1938), in preparing his tables, used as the non-central parameter,

$$\phi = \left\{\sum_t n_t C_t^2 / (\sigma^2(\nu_1 + 1))\right\}^{1/2}. \tag{3}$$

This notation was repeated in their power charts by Pearson & Hartley (1951). We shall
write $\beta(\phi)$ for the power associated with the standard conditions. When the within-group
variances $\kappa_{2t}$ differ, several forms of comparison may be of interest, and these are discussed
below.

We shall consider only the set of variances D, i.e. with $\kappa_{2t}$ as (1, 1, 1, 3), and with these we
take the four combinations of the group frequencies $n_t$ shown in Table 4 (with $\sum n_t = 40$) and
two different cases of divergent means, namely,

Case (i): $C_2 = C_3 = C_4$ but $C_1$ different, i.e. there is a divergent mean in a group with low
variance.

Case (ii): $C_1 = C_2 = C_3$ but $C_4$ different, i.e. the divergent mean is in the group with high
variance.

The first step in the investigation is to confirm that the David-Johnson method provides
close approximations to the power when the $\kappa_{2t}$ are equal. A comparison is made in Table 3










Table 2. Unequal groups

(1) Set A ($\kappa_{2t}$'s: 1, 1, 1, 2). (Tail areas estimated from $S_U$ curves)

                                  Nominal significance level
n_1   n_2   n_3   n_4   Sum n_t      5 %       1 %
  8     8     8    16      40       0.032     0.006
 10    10    10    10      40       0.055     0.013
 11    11    12     6      40       0.077     0.020
 16    16    16    32      80       0.030     0.005
 20    20    20    20      80       0.054     0.012
 22    23    23    12      80       0.076     0.020

(2) Set D ($\kappa_{2t}$'s: 1, 1, 1, 3). $\sum n_t = 40$

                              Nominal significance level
                                    5 %                    1 %
n_1   n_2   n_3   n_4   Edgeworth   S_U     Type IV       S_U
  7     7     7    19    0.021      0.017   0.016         0.002
  9     9     9    13    0.049      0.043   -             0.004
  9     9    10    12    0.054      0.049   -             0.012
  9    10    10    11    0.059      0.056   -             0.015
 10    10    10    10    0.064      0.063   0.063         0.018
 12    12    12     4    0.103*     0.132   0.134         0.052

* The tail ordinates of the Edgeworth curve had a 'hump' in this case and would have been negative
for part of the range but for the term in $\kappa_3^2(X)$.

(3) Set D ($\kappa_{2t}$'s: 1, 1, 1, 3). $\sum n_t = 80$. (Tail areas estimated from $S_U$ curves)

                        Nominal significance level
n_1   n_2   n_3   n_4      5 %       1 %
 13    13    14    40     0.013     0.002
 17    17    17    29     0.026     0.005
 18    18    18    26     0.042     0.009
 19    19    20    22     0.053     0.014
 19    20    20    21     0.057     0.015
 20    20    20    20     0.061     0.016
 24    24    24     8     0.131     0.048





































which is clearly satisfactory, and gives some confidence in using the method when the 
variances are unequal. In the further work, the Edgeworth series has been used, except 
where indicated by notes below Table 4. 


Table 3. Equal variances. $\sum n_t = 40$. Comparison of approximations to true
significance level and power

Significance level                  Power
                                                         phi = 1   phi = 2   phi = 2.5
True level         0.050            Tang                  0.325     0.904     0.987
Edgeworth series   0.053            Edgeworth series      0.332     0.901     0.986
S_U                0.050            Type III curve        0.329     0.903     0.986
Type IV            0.050            Log-normal curve      0.328     -         -























We have now to consider what comparison may be of interest in illustrating the effect of
unequal variances. In the first place, we may compare the power computed as described
for cases (i) and (ii) with that obtained if Tang's tables or the Pearson-Hartley charts are
entered with

$$\phi_1 = \left\{\sum_t n_t C_t^2 / (\sigma_0^2(\nu_1 + 1))\right\}^{1/2}, \tag{4}$$

where $\sigma_0^2$ is the common low variance of the first three groups (taken as unity in the
calculations). In this way we can judge how, in the cases illustrated, the unrecognized presence of
a single high group-variance will reduce the chance of detecting a divergent mean value,
using the ordinary method of analysis.

An alternative approach is to replace $\sigma^2$ in equation (3) by $\bar\kappa_2 = \sum_t n_t \kappa_{2t}/N$, i.e. to enter
the power function table with

$$\phi_2 = \left\{\sum_t n_t C_t^2 / (\bar\kappa_2(\nu_1 + 1))\right\}^{1/2}. \tag{5}$$

For small differences among the $\kappa_{2t}$ it would be expected that the use of $\bar\kappa_2$ in (5) would
provide a good approximation to the true power. If the power function is to be used, in
conjunction with a rough estimate of within-group variability derived from past experience,
to explore in advance the size of samples required to make a critical experiment worth
while, then this rough estimate may well approximate to the average variance $\bar\kappa_2$. The
example we have taken with the $\kappa_{2t}$ having values (1, 1, 1, 3) is perhaps more extreme than
would be expected under these conditions.

Table 4 gives the appropriate comparisons. For both cases (i) and (ii) the $C_t$ were chosen
for convenience to make the $\phi_1$ of equation (4) assume exact values of 1, 2 and 2.5. For given
$\sum n_t C_t^2$ and hence fixed $\phi_1$ (since $\sigma_0^2 = 1$), $\phi_2$ will be constant for a comparison between cases
(i) and (ii) within a cell of the table, but will alter with the group frequencies in passing down
a column, since $\bar\kappa_2$ is a weighted mean of the $\kappa_{2t}$.
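Since (4) and (5) differ only in the variance used as divisor, $\phi_2 = \phi_1 (\sigma_0^2/\bar\kappa_2)^{1/2}$. The Python sketch below evaluates this for the four sets of group frequencies considered with the variances (1, 1, 1, 3); the function name and argument names are ours and purely illustrative.

```python
import math

def phi2_from_phi1(phi1, group_sizes, group_variances, low_variance=1.0):
    """Convert phi_1 of (4) into phi_2 of (5) via the weighted mean variance."""
    total = sum(group_sizes)
    mean_variance = sum(n * k for n, k in zip(group_sizes, group_variances)) / total
    return phi1 * math.sqrt(low_variance / mean_variance)

variances = (1, 1, 1, 3)
frequency_sets = [(7, 7, 7, 19), (9, 9, 10, 12), (10, 10, 10, 10), (12, 12, 12, 4)]
for sizes in frequency_sets:
    row = [round(phi2_from_phi1(p, sizes, variances), 3) for p in (1.0, 2.0, 2.5)]
    print(sizes, row)
```

The more observations fall in the high-variance group, the larger $\bar\kappa_2$ becomes and the further $\phi_2$ falls below $\phi_1$.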

Taking first the true power (as calculated from the approximations), we see that within 
a given cell of the table the true power is always less for case (ii) than for case (i), apart from 
one exception where the power is small. This means that for a worth-while power, we are 
less likely to detect a divergent mean when this occurs in a group with large variance. As has 







been seen already in connexion with Table 2 (2), when $n_4$ is well above or below the average
group frequency of 10, the significance level is seriously affected for this case D.
Consider now comparisons with the power $\beta(\phi)$ obtained from Tang's tables, based on
taking $\phi = \phi_1$. We find again, as we should expect, that for a given set of $C_t$ an increase in
the variances of one group lowers the power of the test. Except in the case where $\phi_1 = 1$ and
the $n_t$ are (12, 12, 12, 4), where the actual significance level is much higher than the nominal
5 %, the power for cases (i) and (ii) is always lower, and generally much lower than $\beta(\phi = \phi_1)$.


Table 4. $\kappa_{2t}$'s: 1, 1, 1, 3. $\sum n_t = 40$. Nominal significance level 5 %

Group frequencies    Real significance      Approximate power based on Edgeworth series†
(n_1, n_2, n_3, n_4) level*                                        phi_1 = 1      phi_1 = 2   phi_1 = 2.5

(7, 7, 7, 19)        E        0.021         Approx. power (i)      0.128          0.576       0.835
                     S_U      0.017         Approx. power (ii)     0.115          0.540       0.768
                     Type IV  0.016         beta(phi = phi_1)      0.325          0.904       0.987
                                            beta(phi = phi_2)      0.18           0.61        0.82
                                            phi_2                  0.716          1.432       1.790

(9, 9, 10, 12)       E        0.054         Approx. power (i)      0.211          0.724       0.918
                     S_U      0.049         Approx. power (ii)     0.201          0.644       0.830
                     Type IV  -             beta(phi = phi_1)      0.325          0.904       0.987
                                            beta(phi = phi_2)      0.21           0.71        0.89
                                            phi_2                  0.791          1.581       1.976

(10, 10, 10, 10)     E        0.064         Approx. power (i)      0.216 [(i)_1]  0.764       0.936
                     S_U      0.063         Approx. power (ii)     0.233 [(ii)_2] 0.672       0.846
                     Type IV  0.063         beta(phi = phi_1)      0.325          0.904       0.987
                                            beta(phi = phi_2)      0.23           0.74        0.91
                                            phi_2                  0.816          1.633       2.041

(12, 12, 12, 4)      E        0.103         Approx. power (i)      0.365          0.880       0.979
                     S_U      0.132         Approx. power (ii)     0.339 [(ii)_3] 0.759       0.892
                     Type IV  0.134         beta(phi = phi_1)      0.325          0.904       0.987
                                            beta(phi = phi_2)      0.28           0.84        0.96
                                            phi_2                  0.913          1.826       2.282

* Three methods of approximating to the real significance level have been used: E, with the Edgeworth
series; $S_U$, with Johnson's (1949) curve; type IV, with the Pearson curve.

† Only the Edgeworth series was used in calculating the power, except in the cases marked (i)_1, (ii)_2, (ii)_3;
$S_U$ curves and the log-normal curve were used in addition to the Edgeworth series to calculate the power here,
the values obtained being: (i)_1 $S_U$ 0.228, (ii)_2 $S_U$ 0.245, (ii)_3 log-normal 0.354.


However, more instructive comparisons are made taking $\phi = \phi_2$. Here in a number of
cases $\beta(\phi = \phi_2)$ lies between the power for case (i) and case (ii). For low values of the power,
say less than 0.50, its values may be considerably influenced by a wrong start to the curve
at $\phi = 0$, i.e. by the difference between the actual and nominal significance levels. But it
will be seen that as the power becomes large, so that there is a worth-while chance of
establishing the significance of differences in means, the tabular power based on the weighted
mean variance gives a very reasonable approximation to the true power, particularly when
the divergent mean is in a group with lower variance.

Comparison is made easier by plotting the power curves as in Figs. 1 and 2 using $\phi = \phi_2$ as
abscissa. The curves for the frequency combination (9, 9, 10, 12) have not been drawn in as
they fall very close to those for equal groups, (10, 10, 10, 10). The standard curve, with
ordinate $\beta(\phi)$, is shown in both figures by a broken line.








[Fig. 1. Divergent mean in a group with low variance. $\kappa_{2t}$: (1, 1, 1, 3). Power curves for group frequencies ($n_1$, $n_2$, $n_3$, $n_4$) = (7, 7, 7, 19), (10, 10, 10, 10) and (12, 12, 12, 4), with the equal-variance curve shown by a broken line. Ordinate: chance of significant result; abscissa: $\phi$, from 0 to 2.5.]

[Fig. 2. Divergent mean in the group with high variance. $\kappa_{2t}$: (1, 1, 1, 3). Power curves for group frequencies ($n_1$, $n_2$, $n_3$, $n_4$) = (7, 7, 7, 19), (10, 10, 10, 10) and (12, 12, 12, 4), with the equal-variance curve shown by a broken line. Ordinate: chance of significant result; abscissa: $\phi$, from 0 to 2.5.]













For case (i) (Fig. 1), the curve for frequencies (10, 10, 10, 10) is very close to the standard.
The curves for (7, 7, 7, 19) and (12, 12, 12, 4) lie, respectively, below and above the other two
curves for low $\phi$, owing to the different values of the significance levels. Allowing for this,
however, we see in the different slopes of the curves the advantage of having more observations
in the group with high variance.

For case (ii) (Fig. 2), the curve for equal frequencies falls away from and below the
standard curve as $\phi$ increases. The curve for frequencies (7, 7, 7, 19) starts at $\phi = 0$ with an
ordinate of 0.02 instead of 0.05, but has passed above the equal frequency curve when $\phi = 2$,
though it has not caught up with the standard. On the other hand, the curve for frequencies
(12, 12, 12, 4), in spite of its high start at 0.13 when $\phi = 0$, shows the lowest power when $\phi = 2$.

It is difficult to summarize these results in any simple form, but perhaps the following is 
a fair statement of the position. In general, where there is no very clear information as to 
how any heterogeneity in group variances is apportioned, it will be best to work with equal 
group frequencies. If, however, there are definite grounds for believing that the observations 
in one or more groups have a variance above the average, we must avoid taking less than 
the average number of observations from these groups. This is because we do not want to 
run the risk of claiming a significant effect (when there are no real differences in means) 
considerably more often than has been allowed for on the basis of the significance level 
chosen. It will, indeed, be worth while taking a few more observations from the groups whose 
variance we believe to be above average, and there is no great danger of overdoing this. For 
there can be no objection to the risk of wrongly claiming significance being less than we have 
allowed for (e.g. for the power curve to be below the standard near $\phi = 0$), if the true curve
makes a good recovery as $\beta(\phi = \phi_2)$ increases above 0.50.
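
The effect of the allocation on the real significance level can also be illustrated by direct simulation. The sketch below is not the method of this paper, which uses Edgeworth, Johnson and Pearson approximations. It is a plain Monte Carlo experiment assuming normal observations with equal group means, and the critical value 2.866 is an assumed approximate upper 5% point of F with 3 and 36 degrees of freedom taken from standard tables. All function names are hypothetical.

```python
import random

# Assumed approximate upper 5% point of F with 3 and 36 degrees of freedom
# (four groups, 40 observations in all), taken from standard F tables.
F_CRIT_3_36 = 2.866


def f_ratio(groups):
    """One-way analysis-of-variance F ratio for a list of samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (between / df_between) / (within / df_within)


def rejection_rate(sizes, variances, reps=4000, seed=1):
    """Fraction of simulated experiments with equal means declared significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        groups = [[rng.gauss(0.0, v ** 0.5) for _ in range(n)]
                  for n, v in zip(sizes, variances)]
        if f_ratio(groups) > F_CRIT_3_36:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for sizes in [(7, 7, 7, 19), (10, 10, 10, 10), (12, 12, 12, 4)]:
        print(sizes, rejection_rate(sizes, (1, 1, 1, 3)))
```

The hard-coded critical value applies only to four groups with 40 observations in all. With more observations in the high-variance group the simulated rate should fall below the nominal 5%, and with fewer it should rise above it, in line with the recommendation above.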


5. ALTERNATIVE TEST PROCEDURES 
Where the group variances are known, the statistic

$$L = \sum_t w_t (k_{1t} - \bar{k}_1)^2 \qquad (6)$$

may be used to test for the homogeneity of the $l$ group means ($t = 1, 2, \ldots, l$), where $w_t = n_t/\kappa_{2t}$,
$k_{1t}$ = the $t$th group sample mean and $\bar{k}_1 = \sum_t w_t k_{1t} / \sum_t w_t$.

It may easily be shown that $L$ is distributed as $\chi^2$ with $l - 1$ degrees of freedom when the
group means are all equal. In another connexion, the L-test has been found useful in comparing
estimates of the constants of a number of probit lines (see, for example, Miller, Bliss
& Braun, 1939).

When the estimates of the group variances are based on sufficiently large numbers of
observations, $w_t$ may be replaced by $w_t' = n_t/k_{2t}$, where $k_{2t}$ ($t = 1, 2, \ldots, l$) are the estimated
variances.

$$L' = \sum_t w_t' (k_{1t} - \bar{k}_1')^2, \qquad (7)$$

where $\bar{k}_1' = \sum_t w_t' k_{1t} / \sum_t w_t'$, will then be distributed approximately as $\chi^2$ with $l - 1$ degrees of
freedom, and the approximate power of the test may be obtained from the non-central
$\chi^2$ distribution. For small sample sizes the above test will not be very trustworthy. James
(1951) has devised corrections to the significance levels of $\chi^2$ depending on the estimates
of group variances, while Welch (1951) uses an F-distribution to approximate to the
distribution of $L'$. Both procedures should provide a better test.
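
The arithmetic of the weighted statistic can be sketched as follows. This is a minimal illustration only: the function name and the example figures are hypothetical, and the same routine serves for known variances (statistic (6)) or estimated variances (statistic (7)), depending on which variances are supplied. The result is to be referred to tables of chi-squared with the returned degrees of freedom.

```python
def weighted_homogeneity_statistic(means, variances, sizes):
    """Return (statistic, degrees of freedom) for the weighted test of equal means.

    Each group has weight n / variance; the statistic is the weighted sum of
    squared deviations of the group means from their weighted mean.
    """
    weights = [n / v for n, v in zip(sizes, variances)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight
    statistic = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means))
    return statistic, len(means) - 1


if __name__ == "__main__":
    # Hypothetical example: four groups of 10, the last with three times the variance.
    print(weighted_homogeneity_statistic([0.0, 0.0, 0.0, 1.0],
                                         [1.0, 1.0, 1.0, 3.0],
                                         [10, 10, 10, 10]))
```

Because the divergent group has the largest variance it receives the smallest weight, so a given departure of its mean contributes less to the statistic than the same departure in a low-variance group.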










The fitting of S, curves in this investigation was facilitated by formulae due to J. Draper 


(1952). 


The author's thanks are due to Prof. E. S. Pearson, Dr N. L. Johnson and Dr F. N. David
for suggesting the topic of this paper and for assistance in its preparation. Acknowledgement 
is also made to the Chief Scientist, Ministry of Supply, for permission to publish this note. 


REFERENCES 


David, F. N. & Johnson, N. L. (1951). Biometrika, 38, 43.
Draper, J. (1952). Biometrika, 39, 290.
Gronow, D. G. C. (1951). Biometrika, 38, 252.
Gronow, D. G. C. (1953). Biometrika, 40, 222.
James, G. S. (1951). Biometrika, 38, 324.
Johnson, N. L. (1949). Biometrika, 36, 149.
Miller, L. C., Bliss, C. I. & Braun, H. A. (1939). J. Amer. Pharm. Ass. 28, 644.
Pearson, E. S. & Hartley, H. O. (1951). Biometrika, 38, 112.
Tang, P. C. (1938). Statist. Res. Mem. 2, 126.
Welch, B. L. (1937). Biometrika, 29, 350.
Welch, B. L. (1951). Biometrika, 38, 330.







THE ESTIMATION OF POPULATION PARAMETERS FROM DATA 
OBTAINED BY MEANS OF THE CAPTURE-RECAPTURE METHOD 


III. AN EXAMPLE OF THE PRACTICAL APPLICATIONS OF THE METHOD


By P. H. LESLIE, DENNIS CHITTY AND HELEN CHITTY
Bureau of Animal Population, Department of Zoological Field Studies, Oxford 


CONTENTS

1. Introduction
2. Description of the area and field methods
3. Statistical analysis
   A. The Microtus population
      (i) Preliminary analysis
      (ii) Data for the marked individuals
      (iii) Methods of estimating the death-rate and of constructing life-tables
   B. The Clethrionomys population
      (i) Preliminary analysis
      (ii) Data for the marked individuals
      (iii) Estimation of total numbers
4. Discussion of results
5. Summary
Appendix
References


1. INTRODUCTION 


The theory of estimating various population parameters from data obtained by the capture- 
recapture method has been discussed in two earlier papers (Leslie & Chitty, 1951; Leslie, 
1952), with special reference to the types of problem which may arise in applying this method 
to populations of small mammals. It was also shown, by means of an actual sampling experi- 
ment carried out on an artificial population of counters, that we should obtain satisfactory 
estimates of these parameters, together with their variances, from equations which are 
based mathematically on a deterministic model of the population, provided that the 
sampling of the individual members is wholly at random and that the size of the samples 
is not too small. We now propose to examine in some detail the results of an experiment in 
the field, in which two populations of small rodents, namely, the field vole Microtus agrestis 
and the bank vole Clethrionomys glareolus, living on the same area, were sampled by means 
of a live-trapping technique over a period of very nearly two years, between May 1948 and 
April 1950. 

The most important question which immediately arises in the application of this form of 
analysis to any set of field data is whether the basic assumptions which are made in the 
theoretical development were in any way realized. Thus, all the equations which have been 
given for estimating the different parameters are based on the assumptions that the sampling 
of the individuals was entirely at random, and that all classes of marked and unmarked 
animals were being caught with equal facility. It is necessary, therefore, in the first place 










to examine a set of data with this point particularly in mind, for if there is a complete 
breakdown in these assumptions, then no valid estimates can be made of the parameters 
we wish to determine. The results presented here will show how dangerous it may be to 
apply theoretical methods of estimation to unsuitable data, without some such form of 
preliminary analysis. 

The main purpose of the present paper is, then, to see whether any of the methods of
estimation which have been developed on a purely theoretical basis can be applied to a set 
of field-data obtained by trapping a population of living animals. We shall not be con- 
cerned either with the purposes of the original inquiry, or with the biological or ecological 
interpretations which may be placed on any of the results. These, together with the reasons 
why certain technical procedures were adopted in the field, we propose to discuss in greater 
detail elsewhere. 

Finally, a word as to nomenclature. It is evident that in trapping a population of wild 
animals we are not obtaining a sample covering all the possible age-groups alive at a given 
time. Thus, by this method of sampling, we can have no knowledge of the number of young 
in the nest, and it is not until these young grow up and become active that they gradually 
become part of the trappable population. It would be tedious and cumbersome to refer 
each time to the ‘ population at risk of capture’, and it must be clearly understood that when, 
for the sake of simplicity, we use instead the term ‘total population’, or ‘the total number 
of individuals alive at time ¢t’, we mean only those members of the population which are 
capable of entering the traps. In the case of a species with an intermittent breeding season, 
however, these members may constitute the entire population on the area at certain times 
of the year. Similarly, for the sake of brevity, we shall use ‘dilution-rate’ and ‘death-rate’ 
as comprehensive terms which would also include any rates of immigration and emigration 
to and from the area. 

Since we shall have frequently to cross-refer to various sections of the two previous papers
(Leslie & Chitty, 1951; Leslie, 1952), these references will in future be given in the shortened 
form I, §x and II, § y respectively. 


2. DESCRIPTION OF THE AREA AND FIELD METHODS 


The trapping area (Fig. 1) of about 4 acres lay on the north-east side of Lake Vyrnwy, 
Montgomeryshire, and had been planted in 1947 with Norway spruce, which was about 1 ft.
high at the time of our studies. The boundaries of the area were formed by the lake, a stream, 
and mature plantations or scrub, and a road ran through it from south-east to north-west. 
The Microtus population was therefore almost entirely self-contained, the only open channel 
for any immigration or emigration being along the road at either end. The mature-plantation 
and scrub habitats, however, did not prevent other species from moving to and from the 
trapping area. 

The field methods were similar to those used in a previous study (Chitty, 1952), the chief 
differences being that sampling points were allotted at random and that traps were pre- 
baited for 48 hr. before being set. We hoped by these procedures to avoid the error, met with 
before, of drawing samples which were biased by an excess of marked animals (Chitty & 
Kempson, 1949). 

The area was divided into five sections (0, 1, 2, 3, 4), and iron railings along the road
formed a convenient baseline to which points within the sections could be referred. Prebaiting
points were determined by means of a table of random numbers, each section being
sampled in proportion to its area. From May to July 1948 positions were chosen at random 
within sections, all of which were trapped simultaneously, though lightly, between 9 a.m.
and dusk. On each of the following two or three days the traps were lifted and redistributed 
at sites randomly chosen and already prebaited. In September and October 1948 we still 
trapped on the same system except that the traps were set overnight. In November 1948, 
at the suggestion of Mr M. J. R. Healy, a system of stratified random sampling was adopted, 
the most usual method being to allot either one or two positions at random within each 
10 yd. square. If a point chanced to fall beyond the margin of an incomplete square no trap 
was prebaited, and at points on the area where no prebait was taken, no trap was set; 












































































Fig. 1. Sketch-map of the trapping area with co-ordinates shown at 20 yd. intervals. Microtus was absent 
from all hatched areas, but those on the lake side of the road are scrub, suitable for Clethrionomys. 
Other hatched areas are mature woods in which Apodemus only is known to have occurred. The 
road is 15 ft. wide with 3 ft. grass verges which are included in the trapping area. B = blackthorn. 


otherwise either one or two traps were used at each point. The area was now trapped in 
sections on different days, the order being determined at random. From November 1948 
to May 1949 traps were set in the afternoon and visited next morning, but from June 1949 
they were set during the morning and visited twice within the next 24 hr. 

This general plan was modified in various respects, only two of which need be mentioned.
(a) After the first trapping in March and April 1949 the whole area was retrapped in
a single operation, one trap being allotted at random within areas of 20 yd. × 10 yd. (b) In
May and June 1949 we removed part of the Microtus population which ranged over the 
area south-east of the blackthorn, in sections 0 and 1 (Fig. 1). 

The animals were marked by means of numbered leg rings, except in the first summer, 
when a system of toe-clipping or ear-marking was adopted. In going through the individual 
records we have found no reason for believing that any errors were caused by the loss of 
rings or other marks of identification, except in the case of the sample taken from the 
Clethrionomys population in September 1948. 

Some idea of the sampling effort may be gathered from Table 1, which shows that the 
area was well covered by traps. It will be seen that the total number of traps entered, in- 
cluding those found sprung but unoccupied, is always less than the number of occasions on 
which capture might have occurred, i.e. when allowance is made for the number of times 
each trap was visited and reset. These facts indicate that at least one point is likely to have 
fallen within the range of each animal (individual Microtus ranged over areas of at least 
10-15 yd. diameter); also that competition for traps was not so severe that the activities 
of one species seem likely to have introduced serious error into the sampling of another. 
There was a high proportion of immediate recaptures when the traps were lifted and reset
at other points on successive days in the summer of 1948 and in March and April 1949. At
other times the number of these recaptures was minimized by putting the animals first
captured into tins (containing a supply of food and bedding) until the traps had been reset
elsewhere. In assembling the data these immediate recaptures within the period covered
by any one field trip were not counted.

[Table 1. Summary of trapping results.]

[Table 2. Distribution of Microtus recaptures according to the interval since they were last captured ($m_{xt}$), males and females combined. Entries are classified by month when last captured ($x$) and month of capture ($t$), with the totals marked ($s_t$), unmarked ($u_t$), caught ($C_t$) and released ($R_t$).]

It will be seen from Table 1 that the majority of the animals trapped consisted of Microtus, 
followed by Clethrionomys and the shrews, Sorex araneus, S. minutus and Neomys fodiens, 
these five species being found at points scattered all over the area. The shrews, however, were
mostly found dead in the traps and few were released alive. The wood-mouse, Apodemus 
sylvaticus, a relatively wide-ranging species, occurred in small numbers as an invader from 
its natural habitat outside the area. The only data suitable for analysis were thus obtained 
from the Microtus and Clethrionomys populations. 


3. STATISTICAL ANALYSIS 


The first trapping took place on 11-14 May 1948 and the last on 18-24 April 1950, but since 
section 4 of the area was omitted on the first occasion, the origin of the sampling chain was 
taken for the purpose of this analysis at 16-19 June 1948. The samples were taken at 
unequal intervals of time, and although this in no way affects the methods of estimation 
which will be employed, it is necessary to have some approximate and simple time scale, 
so that a parameter such as the survival factor, determined for a particular interval, may 
be expressed in some unit of time for comparative purposes. Working in units of 28 days 
from an origin at 16-19 June 1948, the scale in the second column of Table 1 was found to 
correspond satisfactorily with the true dates at which the samples were taken. In addition, 
a purely arbitrary scale is a great convenience in order to simplify the equations for estimating
the various parameters and the actual computational procedure. Thus, commencing
in June 1948, the trappings were carried out at $t_0, t_1, t_2, \ldots, t_{12}$, and we will label the successive
samples with these suffixes, so that the June sample is regarded as having been taken at
$t = 0$, the July sample at $t = 1$, and so forth, the chain ending with the sample taken in
April 1950, at $t = 12$.
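
The conversion of calendar dates to the 28-day scale may be sketched as below. Only the two dates quoted in the text are used (the first day of the June 1948 trapping as origin and the first day of the April 1950 trapping), and the choice of the first day of each trip as its date is an assumption of this sketch, as is the function name.

```python
from datetime import date

UNIT_DAYS = 28  # length of one time unit, as stated in the text


def elapsed_units(origin, sample, unit_days=UNIT_DAYS):
    """Time of a sample, in units of `unit_days`, measured from the origin date."""
    return (sample - origin).days / unit_days


if __name__ == "__main__":
    origin = date(1948, 6, 16)      # first day of the June 1948 trapping
    last = date(1950, 4, 18)        # first day of the April 1950 trapping
    print(round(elapsed_units(origin, last), 2))
```

Such a scale allows a survival factor estimated over an interval of unequal length to be reduced to a common unit of time, while the integer labels $t = 0, 1, \ldots, 12$ remain purely a matter of bookkeeping.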

Because a number of animals died in the traps or were purposely removed from the area, 
it is necessary to adapt some of the equations and methods given in the previous two papers 
in order to allow for these losses.* The corrections which must be made are, however, 
relatively simple, and the same symbolism with slight modifications can be used as in the 
theoretical development given earlier. Thus, the results of each trapping will be based on 
the numbers of animals captured, either alive or dead, when the traps came to be examined. 


Let $C_t$ = the number of individuals captured at time $t$,
$d_t$ = the number dead in the traps or removed from the population,
and $R_t$ = the number released alive, so that $C_t - d_t = R_t$.

* In all the tables given in this paper, these losses are indicated by a superscript to the actual
number caught. Thus, an entry such as 30⁴ is to be read as: 30 animals were captured of which 4 were
either found dead in the traps or removed from the population.

The $C_t$ individuals captured at time $t$ can be divided into the number $u_t$ which had not been
previously captured, and the number $s_t$ which had been captured at least once. Throughout
the present work, except in one instance, these recaptures were grouped according to the
interval of time since they were last captured (Method B: I, § 4). Thus, if

$m_{xt}$ = the number of animals captured at time $t$ which were last captured at time $x$
($x = 0, 1, 2, \ldots, t-1$),

we have

$$\sum_{x=0}^{t-1} m_{xt} = s_t \quad \text{and} \quad s_t + u_t = C_t.$$

We will also define

$\psi_t$ = the total number of animals captured at least once which are alive in the population
as a whole at time $t$,

and putting $y_t = u_t - d_t$,

then, immediately after the release of the $R_t$ living individuals at time $t$, the total number of
this class in the population will be $\psi_t + y_t$. Then, if

$P_t$ = the survival factor over the interval of time $t$ to $t+1$,

we expect $P_t(\psi_t + y_t)$ of these to be alive at the time of the next trapping. Similarly, if

$N_t$ = the total number of individuals alive in the population at the trapping at time $t$,

$N_t - d_t$ will be alive at the time of release, of which $P_t(N_t - d_t)$ will be alive at time $t+1$. This
is, of course, to adopt as before a purely deterministic model for the population. Lastly,
we put

$B_t$ = the dilution factor over the interval $t$ to $t+1$, so that the number of new entries
alive at time $t+1$ is $B_t(N_t - d_t)$.

Thus

$$N_{t+1} = (P_t + B_t)(N_t - d_t)$$

and

$$\lambda_t = N_{t+1}/(N_t - d_t) = P_t + B_t.$$
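
One step of this deterministic bookkeeping can be written out as a short sketch. The function name and the numbers in the example are hypothetical and serve only to show how the survivors and the new entries combine.

```python
def project_population(n_t, d_t, p_t, b_t):
    """Advance the deterministic model by one interval.

    n_t: total alive at the trapping; d_t: losses at the trapping;
    p_t: survival factor; b_t: dilution factor over the interval.
    Returns (survivors, new entries, total at the next trapping, rate of change).
    """
    at_release = n_t - d_t               # alive when the catch is released
    survivors = p_t * at_release         # of these, alive at the next trapping
    new_entries = b_t * at_release       # dilution by new entries
    n_next = survivors + new_entries
    rate = n_next / at_release           # equals p_t + b_t
    return survivors, new_entries, n_next, rate


if __name__ == "__main__":
    print(project_population(100.0, 4.0, 0.5, 0.25))
```

With 100 animals, 4 losses, a survival factor of 0.5 and a dilution factor of 0.25, 96 animals are alive at release, 48 survive, 24 enter, and the next total is 72, giving a rate of 0.75.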


(A) The Microtus population 


The first important point to be decided was whether there was any evidence in these data 
to support the basic assumption which is made in the theoretical development, that marked 
and unmarked animals were captured with equal facility. It has been shown by Chitty 
(1952) that in the case of a particular Microtus population the unmarked animals were 
deficient in the samples taken during the autumn and winter of 1937-8. The effect of this 
non-random sampling was that during a period of the year when no breeding was taking 
place, it nevertheless appeared as though dilution through new entries was still occurring. 
This phenomenon was attributed to a greater degree of relative trap-shyness that persisted 
among the unmarked individuals until February. The methods of trapping, however, which 
were used at that time were not so developed as they later became, and it was hoped that 
the system of prebaiting adopted in the present series would have largely overcome any 
degree of trap-shyness. 

Before this point is examined, the biology of the Microtus population must be considered. 
The breeding season of 1948 had ended by the beginning of September. It seems likely, 
however, that a certain number of young born towards the end of the season might still 
have been growing up to enter the trappable population between September and October. 
Breeding started in April 1949, and a few young had grown up to be trapped on 20-23 May. 
No young had been caught in April. Thus, between October 1948 and April 1949, the 







Microtus population should have been decreasing in numbers merely owing to the death- 
rate, and any dilution factors which can be estimated during this period should therefore 
be zero. 


(i) Preliminary analysis 

A preliminary analysis of the combined male and female population was therefore first
carried out by a method of estimation which is very similar to that described in II, § 5b.
We there considered a table of $m_{xt}$ values when the data for a long chain of samples are
grouped by Method B (I, § 4), and for the purposes of rough estimation we utilized only the
last two entries in each column of the table, and determined the successive values of $P_t$ and
$N_t$ from a series of overlapping triangles of $m_{xt}$ entries. Although this method is satisfactory
when relatively large numbers of animals are recaptured in the various classes, a slight
difficulty was met with in applying it to some of the data analysed here, owing to the fact
that in some tables certain of the $m_{xt}$ entries for $x = t - 2$ were zero, while one or more of the
entries in the same column for $x < t - 2$ still remained $> 0$ (e.g. the sample for October 1948
in Table 2). Under these conditions unsatisfactory estimates of $N_t$ and $P_t$ will be obtained,
and a modified method of estimation was therefore adopted, which is described in an appendix
to this paper.

In principle this method is similar to that described in II, § 5b, but at the same time it
utilizes more of the information contained in the original table. From the latter we form
a new table by retaining each $m_{t-1,t}$ entry and then summing each column of $m_{xt}$ over the
values of $x = 0$ to $x = t - 2$. Thus, for each value of $t > 1$, we have the pair of observed values

$$m_{t-1,t}, \qquad z_t = \sum_{x=0}^{t-2} m_{xt} \qquad (z_t + m_{t-1,t} = s_t).$$

Then from these entries and the number of animals captured ($C_t$) and released ($R_t$), together
with the value of $y_t = u_t - d_t = R_t - s_t$, we first of all calculate the maximum likelihood
estimate

$$\psi_t = \frac{R_t z_{t+1}}{m_{t,t+1}} + s_t \qquad (t = 0, 1, 2, \ldots, T-1).$$

Then, the successive survival factors are given by

$$P_t = \psi_{t+1}/(\psi_t + y_t) \qquad (t = 0, 1, 2, \ldots, T-2),$$

and

$$N_t = \psi_t C_t / s_t \qquad (t = 1, 2, 3, \ldots, T-1).$$

Instead of this estimate of $N_t$, which is known to be positively biased (II, § 1), we may also
use the type of adjusted estimate

$$\psi_t (C_t + 1)/(s_t + 1),$$

though we have not done so in the rough, preliminary analysis of the present set of data.
Once these estimates of $P_t$ and $N_t$ have been calculated, which may be done very quickly,
the values of $\lambda_t$ follow from

$$\lambda_t = N_{t+1}/(N_t - d_t),$$

and hence those of $B_t = \lambda_t - P_t$ by subtraction. In a preliminary analysis of this nature it
will probably not be necessary to calculate the sampling variances of these estimates of
$P_t$ and $N_t$, though the expressions for doing this are also given in the appendix.
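
The order of the computation may be made concrete with a short sketch. The recapture table, catches and losses below are invented for illustration and are not the Lake Vyrnwy data. The function name, the argument layout and the symbol `z` for the column sums are assumptions, and the formula for the estimate of the marked population is taken in the form of the catch released, multiplied by the ratio of the earlier-marked to the last-marked recaptures at the next sample, plus the marked animals caught.

```python
def preliminary_estimates(m, C, d):
    """Rough estimates from a table of recaptures grouped by time last captured.

    m: dict {(x, t): number caught at sample t that were last caught at sample x}
    C, d: lists of catches and losses for samples t = 0, 1, ..., T.
    Returns (psi, P, N, lam, B); psi and P are lists, the rest dicts keyed by t.
    """
    T = len(C) - 1
    s = [sum(n for (x, tt), n in m.items() if tt == t) for t in range(T + 1)]
    last = [m.get((t - 1, t), 0) for t in range(T + 1)]   # caught at t-1 and t
    z = [s[t] - last[t] for t in range(T + 1)]            # last caught before t-1
    R = [C[t] - d[t] for t in range(T + 1)]               # released alive
    y = [R[t] - s[t] for t in range(T + 1)]               # newly marked and released
    # Fails with ZeroDivisionError if no animal released at t is retaken at t+1.
    psi = [R[t] * z[t + 1] / last[t + 1] + s[t] for t in range(T)]
    P = [psi[t + 1] / (psi[t] + y[t]) for t in range(T - 1)]
    N = {t: psi[t] * C[t] / s[t] for t in range(1, T)}
    lam = {t: N[t + 1] / (N[t] - d[t]) for t in range(1, T - 1)}
    B = {t: lam[t] - P[t] for t in lam}
    return psi, P, N, lam, B


if __name__ == "__main__":
    m = {(0, 1): 40, (0, 2): 10, (1, 2): 40, (0, 3): 4, (1, 3): 8, (2, 3): 48}
    print(preliminary_estimates(m, C=[100, 100, 100, 100], d=[0, 0, 0, 0]))
```

For this invented chain of four samples the marked population is estimated as 65 and 75 at the second and third samples, the survival factors as 0.65 and 0.6, and the total numbers as 162.5 and 150. The dilution factor over the interval between the second and third samples then follows by subtraction.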

The results of applying these methods to the data in Table 2 are presented in Table 3,
where the estimates of $\psi_t$, $N_t$, $\lambda_t$, $P_t$ and $B_t$ are entered opposite the months of the year in
which the respective samples were taken. In reading this table, as well as others given later
in this paper, it is to be noted that the estimates $\lambda_t$, $P_t$ and $B_t$ refer to the intervals between
the samples, whereas $\psi_t$ and $N_t$ refer to the state of the population at the time the actual
sample was taken. Thus, in Table 3, $P_0 = 0.424$ is the estimated survival factor over the
interval between the June and July samplings, and similarly $B_1 = 0.661$ is the dilution
factor between those of July and September. Although, strictly speaking, $\psi_t$ and $N_t$ can only
take integral values, it is convenient to retain at least one decimal place in order to
distinguish figures which are estimates from those which are known, true values.


Table 3. Estimates of population parameters for Microtus from data in Table 2

Month           $t$     $\psi_t$    $N_t$     $\lambda_t$   $P_t$     $B_t$

1948: June       0      -           -         -             0.424     -
      July       1      40.70       152.6     1.153         0.492     0.661
      Sept.      2      34.26       171.3     1.343         0.759     0.584
      Oct.       3      75.36       226.1     1.273         0.722     0.551
      Nov.       4      84.00       281.4     0.719         0.413     0.306
1949: Mar.       5      52.87       200.2     1.221         0.856     0.365
      Apr.       6      110.38      242.1     0.927         0.626     0.301
      May        7      109.21      220.6     0.646         0.444     0.202
      June       8      66.30       136.1     1.077         0.530     0.547
      July       9      54.18       120.7     1.495         0.817     0.678
      Sept.     10      85.93       176.0     1.080         0.543     0.537
      Nov.      11      82.00       190.0     -             -         -




















For our present purposes, it is the estimates of $B_t$ given in Table 3 which are important.
It will be seen that the individual figures remain positive throughout the chain, whereas
we should have expected to obtain, between October 1948 and April 1949, estimates of
$B_3$, $B_4$ and $B_5$ which, on the average, were zero. That these estimates remain consistently
positive is sufficient to suggest that our assumed random sampling of both marked and
unmarked animals in the population was not in fact fulfilled, and it is therefore necessary to
examine the results for the winter months in greater detail.
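
The point can be checked directly from the printed figures. The sketch below copies the last three columns of Table 3, confirms that each rate of change is the sum of the survival and dilution factors to within rounding, and averages the three winter dilution factors. The variable and function names are illustrative only.

```python
# t: (rate of change, survival factor, dilution factor), copied from Table 3
TABLE_3 = {
    1: (1.153, 0.492, 0.661),
    2: (1.343, 0.759, 0.584),
    3: (1.273, 0.722, 0.551),
    4: (0.719, 0.413, 0.306),
    5: (1.221, 0.856, 0.365),
    6: (0.927, 0.626, 0.301),
    7: (0.646, 0.444, 0.202),
    8: (1.077, 0.530, 0.547),
    9: (1.495, 0.817, 0.678),
    10: (1.080, 0.543, 0.537),
}


def max_identity_gap(rows):
    """Largest discrepancy between the rate of change and survival plus dilution."""
    return max(abs(lam - (p + b)) for lam, p, b in rows.values())


def mean_dilution(rows, times):
    """Average dilution factor over the chosen intervals."""
    return sum(rows[t][2] for t in times) / len(times)


if __name__ == "__main__":
    print(max_identity_gap(TABLE_3))
    print(round(mean_dilution(TABLE_3, (3, 4, 5)), 3))
```

The three winter estimates average about 0.41 per interval, far from the value of zero expected in the absence of breeding.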

A method of testing for the absence of dilution in the population sampled has been given 
in a previous paper (II, §5a). The method was there described in terms of a population 
which was decreasing over the whole sampling chain through the operation of a variable 
death-emigration rate, this rate being assumed to fall equally on all subclasses of marked 
and unmarked individuals in the population. It was also assumed that no deaths were 
caused by the method of sampling. In the present example there are a few accidental 
deaths, and the hypothesis that there was no dilution applies to a period in the middle of 
a chain of samples. These complications cause little difficulty, however, provided that the 
number of accidental deaths is small, and that we tabulate the number of recaptures of all 
individuals marked before the date at which the population is assumed to decrease.

We will assume that no dilution was occurring in the Microtus population from October 
1948 until the next summer, and that all living individuals in the population were exposed 
equally to the risk of capture. In the sample taken in May 1949, it was possible to dis- 
tinguish without any difficulty the young of the year, born during April and early May, 
from the adults which had formed part of the overwintering population. The May sample 










P. H. LESLIE, DENNIS CHITTY AND HELEN CHITTY 145


was therefore considered only in terms of these adults. The recaptures in the samples for 
October, November, March, April and May were then grouped according to the time they 
were first captured and marked, and males and females were treated separately. The data 
for the two sexes are given in Table 4. 


Table 4. Distribution of Microtus recaptures October 1948-May 1949 according to
the month they were first marked and released (adult animals only in May)

Month when                     Month of capture                Total
first marked         Oct.   Nov.   March   April   May      recaptures

Males
June–September        14      8       7       4      2          35
October                -      5       4       2      1          12
November               -      -       3       2      4           9
March                  -      -       -      14      7          21
April                  -      -       -       -     12          12
No. unmarked          25     29      38      34     21
Total catch           39     42      52      56     47

Females
June–September         9      6       5       8      1          29
October                -      1       5       3      3          12
November               -      -       4       7      3          14
March                  -      -       -      17      7          24
April                  -      -       -       -      9           9
No. unmarked          21     18      40      34     16
Total catch           30     25      54      69     39

Now, if our assumptions are correct for the type of population we are dealing with, we 
should expect that the number of marked animals recaptured from each batch of releases 
would form a constant proportion of the total catch in each sample (II, §5a). Thus, neg- 
lecting any effects from the small number of accidental deaths, we may calculate the 
expected number of recaptures for each cell of the table from the mean proportion re- 
captured in each row. For example, a total of 35 males, originally marked between June and 
September, was recaptured in the five samples consisting of a total of 236 animals, a mean 
proportion of 0.1483. The ‘expected’ number of this class caught in October is therefore
0.1483 × 39 = 5.78, compared with the observed number of 14. In a similar fashion, the
remaining expected numbers of recaptures can be calculated, the expected number of
unmarked animals being obtained finally by subtraction. The observed and expected
frequencies may then be used to calculate χ² in the usual way, the number of degrees of
freedom being ½c(c−1) for a table consisting of c columns. Since, however, the expected

Biometrika 40 10 








146 The estimation of population parameters from capture-recapture data 


numbers for the individual cells of the present tables are small, we have compared the totals 
for marked and unmarked animals. The following were the results for the two sexes, the 
expected numbers in each case being given in brackets after the observed frequencies: 














Sample                       Males                                  Females
taken in       Marked        Unmarked      Total      Marked        Unmarked      Total
1948: Oct.    14 (5.78)     25 (33.22)      39        9 (4.01)     21 (25.99)      30
      Nov.    13 (8.79)     29 (33.21)      42        7 (4.95)     18 (20.05)      25
1949: Mar.    14 (13.90)    38 (38.10)      52       14 (15.35)    40 (38.65)      54
      Apr.    22 (26.38)    34 (29.62)      56       35 (34.95)    34 (34.05)      69
      May     26 (34.14)    21 (12.86)      47       23 (28.75)    16 (10.25)      39

For the males χ² = 24.7, and for the females χ² = 12.8; and since there are 4 degrees of freedom
in each case, P < 0.001 and 0.01 < P < 0.02, respectively. Overall, therefore, there is a
significant departure from expectation, and in both sexes the deviations show that in the earlier
samples more marked animals were being captured than expected.
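
The calculation of the expected numbers and of χ² for the males can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the counts are those of Table 4 (males), and the rule used for each batch (row total divided by the combined catch of the samples in which that batch could appear) is the reading of the text that reproduces the printed expectations.

```python
# Table 4, males: recaptures by batch (rows) and sample (columns Oct., Nov., Mar., Apr., May).
# None marks cells that cannot occur (sample taken before the batch was marked).
TOTAL_CATCH = [39, 42, 52, 56, 47]
RECAPTURES = [
    [14, 8, 7, 4, 2],              # first marked June-September
    [None, 5, 4, 2, 1],            # first marked October
    [None, None, 3, 2, 4],         # first marked November
    [None, None, None, 14, 7],     # first marked March
    [None, None, None, None, 12],  # first marked April
]


def expected_marked(recaptures, total_catch):
    """Expected number of marked animals in each sample.

    Each batch contributes its mean proportion (row total divided by the
    combined catch of the samples in which it could appear) times the catch.
    """
    expected = [0.0] * len(total_catch)
    for row in recaptures:
        cols = [j for j, count in enumerate(row) if count is not None]
        proportion = sum(row[j] for j in cols) / sum(total_catch[j] for j in cols)
        for j in cols:
            expected[j] += proportion * total_catch[j]
    return expected


def chi_square(recaptures, total_catch):
    """Chi-square over the marked and unmarked totals of every sample."""
    exp_marked = expected_marked(recaptures, total_catch)
    total = 0.0
    for j, catch in enumerate(total_catch):
        obs_marked = sum(row[j] for row in recaptures if row[j] is not None)
        for observed, expected in (
            (obs_marked, exp_marked[j]),
            (catch - obs_marked, catch - exp_marked[j]),
        ):
            total += (observed - expected) ** 2 / expected
    return total


expected = expected_marked(RECAPTURES, TOTAL_CATCH)
chi2 = chi_square(RECAPTURES, TOTAL_CATCH)
print([round(e, 2) for e in expected])
print(round(chi2, 1))
```

The expected marked totals agree with the bracketed figures for the males to within rounding, and the statistic agrees with the value of 24.7 quoted in the text.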

These results, together with the non-zero values of the winter dilution factors B_t which
have been calculated earlier, show that an apparent dilution of the Microtus population 
had been taking place during the winter of 1948-9, at a time of the year when no new 
individuals should have been entering the population at risk of capture. Judging from our 
previous experience of trapping this species, the most probable explanation of this pheno- 
menon is that marked and unmarked animals were not being caught with equal facility, 
though the reasons for this difference in behaviour are at present obscure. For our present 
purposes, however, the most important conclusion to be drawn from the presence of this 
apparent dilution is that the assumptions made in the mathematical model are not valid 
for this population. If this phenomenon was present during the winter months, we cannot 
exclude the possibility that it may also have been present during the breeding season, when 
true dilution of the population must have been occurring. However, there appears to be no 
obvious way of testing whether or not this was the case. In these circumstances we must 
abandon any idea of estimating the total number of Microtus on the area by any of the 
methods which have been previously given, and which are based on the assumption of a 
purely random sampling of all classes in the population. 


(ii) Data for the marked individuals 


The results so far obtained from the analysis of these data for Microtus raise a further 
important question. If the chances of capturing a marked or an unmarked individual are 
not the same, is this difference necessarily confined to these two broad categories? We can 
conceive of the interaction between a population of animals and a population of traps in 
terms of some kind of ‘learning curve’. The individuals might become more and more 
accustomed to the presence of these unnatural objects and to the actual experience of 
being trapped. The chance of capturing a marked animal might therefore depend on the 
number of times it had previously been trapped. If this were so, then it would be impossible 













to apply any methods of estimation based on the assumption that the individuals were 
being randomly sampled like counters in a drum.

The first way in which we attempted to answer this question was by considering separately 
all those individuals which were captured, marked for the first time, and released alive 
between September 1948 and May 1949, only adults being included in this last month. This 
group of releases was chosen merely because it seemed likely that they would have a more 
homogeneous age constitution than that of groups released during the breeding season. 
The animals falling into this group, which were released at each of the trappings in October, 
November, March, April and May, were then tabulated according to the number of times 
they had been trapped. Since two trappings were carried out in both March and April, an 
individual animal could have been released in, for instance, March, having been trapped 
twice in that month, apart from its previous history. 

Then, if a total of r_t animals are released at time t, of which a_i have been trapped i times
(i = 1, 2, 3, ...; Σ a_i = r_t), and if a total of c_{t+1} of these are recaptured at time t+1, we
should expect, under a hypothesis of a purely random sampling, to find that a_i c_{t+1}/r_t would be
recovered out of the a_i originally released, provided that there was no differential mortality
between the various classes. The numbers of individuals released each month are given in


Table 5. The distribution of Microtus released according to the number of times they had been
trapped, and the number of these recaptured at the next trapping of the area. (♂♂ and ♀♀
combined: adult animals only in May)

No. of     Released   Caught      Released   Caught       Released   Caught       Released   Caught       Released   Caught
times      Oct.       Nov.        Nov.       Mar.         Mar.       Apr.         Apr.       May          May        June
trapped
1             41      6 (6.1)        45       7 (8.4)        63      24 (28.6)       57      20 (16.3)       35      14 (13.3)
2             13      2 (1.9)        12       3 (2.2)        25      16 (11.4)       33       6 (9.4)        29      13 (11.1)
3              -      -               2       1 (0.4)         9       5 (4.1)        16       5 (4.6)         5       1 (1.9)
4              -      -               -       -               1       0 (0.4)        11       3 (3.1)         5       1 (1.9)
5              -      -               -       -               1       0 (0.4)         2       0 (0.6)         2       0 (0.8)
Total         54      8 (8.0)        59      11 (11.0)       99      45 (44.9)      119      34 (34.0)       76      29 (29.0)

Expected number of recaptures given in brackets.


Table 5, together with the number of these recaptured at the next trapping of the area, the 
expected numbers of the latter, based on the distribution at the time of release, being given 
in brackets. Comparing the observed and expected frequencies, it can be clearly seen that 
there was no obvious tendency for animals which had been trapped a number of times to 
be caught more easily at the next retrapping. It is to be noted, however, that the minimum 
interval between the trappings is 28 days, and it does not necessarily follow that we should 
have obtained the same result if the intervals had been very much shorter. 
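
The expected numbers in Table 5 follow directly from the rule a_i c_{t+1}/r_t. The sketch below is illustrative only; the function name is hypothetical and the figures are the October and April releases of Table 5.

```python
def expected_recaptures(released_by_class, recaptured_total):
    """Expected recaptures per class under purely random sampling.

    released_by_class[i] is a_i, the number released that had been trapped
    i + 1 times; recaptured_total is c_{t+1}; r_t is the sum of the a_i.
    """
    r_t = sum(released_by_class)
    return [a_i * recaptured_total / r_t for a_i in released_by_class]


october = expected_recaptures([41, 13], 8)
april = expected_recaptures([57, 33, 16, 11, 2], 34)
print([round(v, 1) for v in october])
print([round(v, 1) for v in april])
```

Because the expectations are proportional to the numbers released, they always sum to the observed total recaptured, which is why the 'Total' row of the table shows identical observed and expected figures apart from rounding.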

This evidence, which is satisfactory so far as it goes, is hardly sufficient, since we are 
utilizing only a portion of the data over a limited period of time. There is another method 
of approach, which is of much greater interest. Suppose we neglect the number of animals 
which were captured unmarked, namely, the u_t figures in Table 2, and confine our attention
to the marked animals which were captured at each of twelve trappings between July 1948 
and April 1950, inclusive. These represent a chain of twelve random samples from the total 


















marked population on the area, and we will now regard this subpopulation as being the 
only one with which we are concerned. Then, from the information yielded by these samples, 
we can, in theory, estimate the total number of individuals in this population, and the 
survival and dilution factors within it for each interval of time. With reference to this
subpopulation we can use exactly the same symbolism as before, the total number of individuals
at time t being N_t from which a sample of C_t has been withdrawn. Similarly, in subdividing
the sample of C_t into the various categories, all those animals which had been caught only
once previously are regarded as new members (corresponding to the unmarked (u_t) class in


Table 6. Distribution of Microtus re-recaptures according to the interval since they were
last recaptured (m_{tx}). Marked population only; ♂♂ and ♀♀ combined

Month when                                   Month of capture (t)
last captured (x)     July  Sept.  Oct.  Nov.  Mar.  Apr.  May  June  July  Sept.  Nov.  Apr.
1948: July              -     3     -     -     -     -     -     -     -     -     -     -
      Sept.             -     -     6     1     -     -     -     -     -     -     -     -
      Oct.              -     -     -     5     2     -     -     -     -     -     -     -
      Nov.              -     -     -     -     7     2     -     1     -     -     -     -
1949: Mar.              -     -     -     -     -    15     3     1     -     -     1     -
      Apr.              -     -     -     -     -     -    13     9     -     -     -     -
      May               -     -     -     -     -     -     -    16     2     -     -     -
      June              -     -     -     -     -     -     -     -    22     -     1     -
      July              -     -     -     -     -     -     -     -     -    23     4     -
      Sept.             -     -     -     -     -     -     -     -     -     -    29     -
      Nov.              -     -     -     -     -     -     -     -     -     -     -     9
s_t                     -     3     6     6     9    17    16    27    24    23    35     9
u_t                    12    14    17    14    19    40    33    30    20    39    47    10
Total catch (C_t)      12    17    23    20    28    57    49    57    44    62    82    19
Total released (R_t)  [?]    17    23    19    28    56    43    44    41    62    80    18

the population as a whole), while all those which had been captured at least twice previously
are regarded as old members (corresponding to the marked class (s_t) in the population as a
whole). These old members can then be separated into the various m_{tx} classes, which now
represent the re-recaptures at time t, which were last recaptured at time x. Once the
principle has been grasped, the methods of assembling the data are simple. The occasion on which
an individual animal was first ringed is no longer of any interest and only re-recaptures
need be considered. By disregarding all animals which were not recaptured at least once,
the amount of data is, of course, seriously reduced, and in the present instance we have
therefore pooled the data for males and females, the resulting distributions being given in
Table 6.

Among the various parameters which we may estimate from these data for the marked
animals there is one, closely connected with the dilution factor B_t for this subpopulation,
which is of particular interest, since the correct answer is known. This parameter we will
define as

Z_t = the number of new additions to the marked population which were made at the
sampling at time t.










Now, we know the actual number of animals which were marked for the first time and
released at each trapping, and we can therefore compare the estimates of Z_t, which are
made from the distribution of the re-recaptures, with the true values. Any marked
discrepancy between these figures, when compared with the standard errors of the estimates,
would indicate that the sampling of the marked population was not at random, in which
case no valid estimates of any population parameter for Microtus could be made from this
particular set of capture-recapture data.


The estimates of Z_t may be very quickly obtained from the m_{tx} values given in Table 6.
As a first step, we obtain, as before,

\[ n_t = \sum_{x=0}^{t-2} m_{tx} \qquad (n_t + m_{t,t-1} = s_t). \]

Then, from the total number of animals captured in the sample (C_t), the number released
(R_t), and the number of accidental deaths or removals (d_t), together with the additional
figure γ_t = R_t − s_t, we calculate first of all

\[ \hat{\psi}_t = \frac{R_t\, n_{t+1}}{m_{t+1,t}} + s_t, \qquad \pi_t = s_t/C_t, \qquad \upsilon_t = 1 - \pi_t, \]

and from these, as shown in the appendix to this paper,

\[ \hat{Z}_t = (\gamma_t + \hat{\psi}_t)/\pi_{t+1} - \hat{\psi}_t/\pi_t + d_t \qquad (t = 1, 2, 3, \ldots, T-1), \]
and, for the variance of this estimate, 


A ah 1. 2( ] a 2n i 2 
ris = Fi) [Pe eases 
Purr St41 Pt M141 Pr} 8 
The results for the Microtus data are presented in Table 7, where the origin of the arbitrary
time scale is now taken to be July 1948. The first estimate of Z_t which can be calculated is
for September 1948, the last for November 1949. The estimated total number of Microtus
marked for the first time and released into the population over this period is 609.4, whereas

Table 7. Estimates, from the data in Table 6, of the number of Microtus which were
marked for the first time and released at each trapping (Z_t)

Month          t     ψ_t      Z_t     True no.    Deviation    Standard
                                      released        Δ        error of Z_t
1948: Sept.    1     3.00     48.2       65         −16.8        ±13.5
      Oct.     2    10.60     51.4       41         +10.4        ±19.1
      Nov.     3    11.42     38.9       45          −6.1        ±13.6
1949: Mar.     4    12.73     66.8       76          −9.2        ±13.4
      Apr.     5    29.92    111.8       65         +46.8        ±26.5
      May      6    45.56     19.6       46         −26.4        ±20.7
      June     7    31.00     35.5       48         −12.5         ±8.4
      July     8    24.00     69.5       54         +15.5        ±10.6
      Sept.    9    35.83     78.8       65         +13.8        ±15.3
      Nov.    10    35.00     88.9      108         −19.1        ±23.3
Total                        609.4      613



the actual number released was 613; a remarkably good agreement. Comparing the
individual Z_t with the true values, it can be seen that on the whole they follow the trend of
the latter reasonably well, the worst discrepancies being in the months of April and May
1949. In regard to the errors of these estimates, there is one point to be noted. The expression
given above for V(Z_t) is derived from a set of maximum-likelihood equations which are
based on the assumption that the probability of obtaining the observed results at each
sampling is given by the appropriate term of a multinomial distribution. In actual fact,
we are drawing samples from a finite population, and in using this expression, therefore,
we may tend to overestimate the standard errors of Z_t, if the proportion of the population
sampled each time was at all large.
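
The arithmetic behind a single line of Table 7 can be sketched as follows. The sketch rests on assumptions that should be checked against the appendix: it reads the estimate of the old class as ψ_t = s_t + n_{t+1} R_t / m_{t+1,t}, the estimate of new additions as Z_t = (ψ_t − s_t + R_t)/π_{t+1} − ψ_t/π_t + d_t with π_t = s_t/C_t, and takes d_t = C_t − R_t. These readings reproduce the October 1948 entries of Table 7 from the figures in Table 6, but the function names are hypothetical and the code is not the authors' own procedure.

```python
def psi_hat(s_t, R_t, n_next, m_next_t):
    """Estimated number of old members at time t (formula as assumed above).

    n_next is the number in sample t + 1 last captured before t;
    m_next_t is the number in sample t + 1 last captured at t.
    """
    return s_t + n_next * R_t / m_next_t


def z_hat(psi_t, s_t, C_t, R_t, s_next, C_next):
    """Estimated number of new additions made at the sampling at time t."""
    pi_t = s_t / C_t              # proportion of old members in sample t
    pi_next = s_next / C_next     # the same proportion in sample t + 1
    d_t = C_t - R_t               # assumed: deaths and removals = catch - released
    return (psi_t - s_t + R_t) / pi_next - psi_t / pi_t + d_t


# October 1948 (t = 2) and November 1948 (t = 3), read from Table 6.
s2, C2, R2 = 6, 23, 23
s3, C3 = 6, 20
n3 = 1     # caught in November, last caught before October (the Sept. row)
m32 = 5    # caught in November, last caught in October

psi2 = psi_hat(s2, R2, n3, m32)
z2 = z_hat(psi2, s2, C2, R2, s3, C3)
print(round(psi2, 2), round(z2, 1))
```

For October 1948 the two results round to the 10.60 and 51.4 printed in Table 7.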

Now, the true proportion sampled each time, namely, f_t = C_t/N_t, is unknown and can only
be estimated from the data. Thus we have s_t/ψ_t as an estimate of f_t, and from this it will be
seen that no estimate can be made of f_t for the last sample in the chain, upon which the
estimate Z_{T−1} is in part based. Moreover, if n_{t+1} = 0, which, it will be seen from Table 6,
occurs on three occasions in the present series, then ψ_t = s_t, and it would appear that the
entire population at risk was sampled at time t, a conclusion which seems somewhat unlikely,
though not impossible, in data of this type. The simplest way of proceeding is therefore to
obtain a pooled estimate of f_t from the available ψ_t, and apply the average correction 1 − f_t
to each of the individual variances. Thus, from Tables 6 and 7 we have Σs_t/Σψ_t = 0.694 and
1 − f_t = 0.306, from which we have the adjusted standard errors of Z_t given in Table 7.
A comparison of these with the deviations of Z_t from the true values suggests that the latter
are of much the order we might expect. Actually Σs²(Z_t) = 3007, compared with ΣΔ² = 4351,
and considering the approximate nature of the correction term employed, this seems a
sufficiently good agreement.* Thus, by treating the marked Microtus as a population from
which we were drawing a series of samples, we have obtained estimates of a parameter that
are sufficiently close to the known, true values for us to conclude that the behaviour of this
part of the total population could be regarded as satisfactory, since there was no evidence
of any marked departure from the assumption of a purely random sampling of these
individuals.
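
The pooled sampling fraction and the two sums of squares quoted above can be recomputed from the rounded figures of Tables 6 and 7, as in the following sketch. The variable names are illustrative, and because the inputs are the rounded tabulated values the sum of squared standard errors comes out slightly below the 3007 quoted in the text.

```python
# s_t from Table 6 and psi_t from Table 7, for t = 1, ..., 10.
s_t = [3, 6, 6, 9, 17, 16, 27, 24, 23, 35]
psi_t = [3.00, 10.60, 11.42, 12.73, 29.92, 45.56, 31.00, 24.00, 35.83, 35.00]

# Deviations of Z_t from the true numbers, and adjusted standard errors (Table 7).
deviation = [-16.8, 10.4, -6.1, -9.2, 46.8, -26.4, -12.5, 15.5, 13.8, -19.1]
std_error = [13.5, 19.1, 13.6, 13.4, 26.5, 20.7, 8.4, 10.6, 15.3, 23.3]

f_pooled = sum(s_t) / sum(psi_t)        # pooled estimate of the fraction sampled
correction = 1.0 - f_pooled             # factor applied to each variance
sum_sq_dev = sum(d * d for d in deviation)
sum_sq_se = sum(e * e for e in std_error)

print(round(f_pooled, 3), round(correction, 3))
print(round(sum_sq_dev), round(sum_sq_se))
```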

The results of this analysis of the data for the Microtus population on this particular 
area can be summarized as follows: 

(1) There was a difference in the behaviour of marked and unmarked animals. A portion 
of the population was being trapped less easily than the remainder, at any rate during the 
winter months, and we cannot exclude the possibility that the same phenomenon was also 
present during the remainder of the year. This being so, no estimates can be made of the 
total Microtus population on the area by methods of estimation which are based on the 
assumption of a random sampling of marked and unmarked animals. 

(2) Once trapped, however, the Microtus population appeared to behave satisfactorily 
from our present point of view. The samples obtained from the marked population yielded 
results which were very much what we might have expected if we had been randomly 
sampling counters from a drum, instead of trapping living animals. The data for marked


* It is possible that this estimate of f_t = 0.694 is too large, owing to the occurrence of three zero values
of n_{t+1} in this series. If we were to neglect the three corresponding values of ψ_t and s_t, we should have
f_t = 104/177 = 0.587, and the adjusted Σs²(Z_t) = 4059. The latter value of f_t agrees very well with the
estimate of f_t = 0.567 which is made in the next section and which is based on the recaptures of all
marked individuals. The present estimates are based on the recaptures of those individuals in the
subpopulation of marked animals which were marked at least twice.










Microtus may therefore be used to estimate such parameters as the death-rate at different
seasons of the year, or the expectation of further life of the individuals released at some 
given date. Such estimates, however, will be applicable, strictly speaking, only to the 
trappable portion of the Microtus population. 


(iii) Methods of calculating the death-rate and of constructing life-tables 


As an example of a simple method of estimating the death-rate per unit of time in the
population as a whole, we may use the estimates of the survival factors P_t given in Table 3.
These estimates are based essentially on the numbers in the various classes of marked
individuals which were recaptured from the known number of animals released (e.g. II,
§§ 1 and 5), and no use is made of the observed proportion of unmarked animals occurring
in the sample of C_t taken at time t. This last point is of importance since, if this was not the
case, the estimates of P_t would be just as untrustworthy as the estimates of total numbers
for this particular Microtus population. The values of P_t given in Table 3 refer, however, to
varying intervals of time, and for comparative purposes, therefore, it is necessary to express
them in terms of some common unit. If we define

μ_t = the force of mortality between t and t + dt, which we assume to remain constant
over the interval t to t + w, we have

\[ P_t = e^{-\mu_t w}. \]

Then

\[ \mu_t = -\log_e P_t / w, \]

with

\[ V(\mu_t) = V(P_t)/(w^2 P_t^2), \]

the expression for calculating V(P_t) being given in the appendix to this paper. These
variances may, however, require correction in order to allow for the finite nature of the population
sampled, particularly if a relatively large proportion of the marked population was being
captured each time. This proportion can only be estimated from the data, and, as before,
we have employed the estimated average proportion f_t sampled over the whole period.
Thus, in Table 3, the values of ψ_t are estimates of the total number of marked Microtus in
the population at time t. The sum of the available values of ψ_t is 795, and from Table 2 the
sum of corresponding s_t is 451. Hence, f_t = 0.567, and we have corrected the estimated
variances of μ_t by multiplying them by the factor 0.433. The following were the values of
μ_t expressed in a unit of time of 28 days, together with their standard errors:























Month              Death-rate μ_t ± s.e.
1948: June–          0.572 ± 0.152
      July–          0.545 ± 0.174
      Sept.–         0.184 ± 0.143
      Oct.–          0.217 ± 0.154
      Nov.–          0.216 ± 0.039
1949: Mar.–          0.129 ± 0.089
      Apr.–          0.468 ± 0.116
      May–           0.812 ± 0.087
      June–          0.577 ± 0.056
      July–          0.135 ± 0.041
      Sept.–Nov.     0.218 ± 0.011
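
The variance expression used for these death-rates has the form of the usual first-order (delta-method) approximation. The short derivation below is a sketch under that assumption, not a transcription of the appendix; f denotes the pooled proportion sampled, 0.567 in the text.

```latex
\mu_t = -\frac{\log_e P_t}{w}
\quad\Longrightarrow\quad
\frac{d\mu_t}{dP_t} = -\frac{1}{w P_t},
\qquad
V(\mu_t) \approx \left(\frac{d\mu_t}{dP_t}\right)^{2} V(P_t)
         = \frac{V(P_t)}{w^{2} P_t^{2}},
\qquad
V_{\mathrm{adj}}(\mu_t) = (1 - f)\, V(\mu_t) = 0.433\, V(\mu_t).
```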



















The death-rate per 28 days in the marked Microtus population, taking both sexes to- 
gether, was high during the summer of 1948. From September onwards it fell and appeared 
to remain approximately constant over the winter months until April 1949. Thereafter 
there was a rise in the rate until July, and between September and November 1949 it was 
again of very much the same order as in the previous year. Although no estimate of the 
death-rate can be made for the period subsequent to November 1949, the rate must have 
increased very greatly during the second winter, since the Microtus population was 
evidently reduced to very small numbers by April 1950. 

The relative constancy of the death-rate during the winter of 1948-9 is of some interest, 
since the population is of a more homogeneous age constitution at this season of the year 
than at any other, and neither sex is in breeding condition. In order to see whether there 
was any difference between the sexes during this time, we also estimated the average death- 
rate between October 1948 and April 1949 by the methods of estimation which have been 
previously given (I, § 5b and Appendix 1). The equations given there are expressed in terms
of a constant survival factor P and an equal interval between the samplings. It is, however, 
easy to adapt them to the case of unequal intervals, and although the equations to be 
solved are somewhat more complicated, the actual method of solution remains, in principle, 
the same. 

In order to estimate the constant survival factor P per unit of time over the period
October 1948 to April 1949, we make use of the number of animals released at and between 
these dates, and of the recaptures of marked animals in the samples taken between Novem- 
ber 1948 and May 1949 inclusive, due allowance being made for any accidental losses or 
removals (I, Appendix 2), and only adult animals being considered in May. The following 
were the estimates of the constant survival factor P per unit of 28 days and the values of 
the death-rate obtained from them. No correction has been made of the standard errors of 
these estimates to allow for the finite nature of population sampled: 














                    Survival factor       Death-rate
                       P ± s.e.            μ ± s.e.
♂♂                  0.725 ± 0.0288      0.322 ± 0.040
♀♀                  0.867 ± 0.0289      0.143 ± 0.033
Combined sexes      0.817 ± 0.0182      0.202 ± 0.022

Evidently there was quite a marked difference in the average death-rates of the two
sexes. The difference between them is 0.179 ± 0.052, which would be reckoned as significant,
quite apart from the fact that we are probably overestimating the individual variances.
The average death-rate for the combined sexes is μ = 0.202, while the average obtained by
weighting the three individual μ_t values in the previous table for October, November and
March by the reciprocal of their variances is μ̄ = 0.203.
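
The figures in this comparison can be checked with a few lines of arithmetic. The sketch below assumes a unit interval (w = 1, i.e. 28 days), uses the first-order approximation s.e.(μ) = s.e.(P)/(wP), and takes its inputs from the two tables above; the function names are illustrative.

```python
import math


def death_rate(P, w=1.0):
    """Force of mortality per unit time from a survival factor over w units."""
    return -math.log(P) / w


def death_rate_se(P, se_P, w=1.0):
    """First-order standard error of the death-rate."""
    return se_P / (w * P)


mu_males = death_rate(0.725)
mu_females = death_rate(0.867)
difference = mu_males - mu_females
se_difference = math.sqrt(
    death_rate_se(0.725, 0.0288) ** 2 + death_rate_se(0.867, 0.0289) ** 2
)

# Inverse-variance weighted mean of the October, November and March death-rates.
rates = [0.217, 0.216, 0.129]
errors = [0.154, 0.039, 0.089]
weights = [1.0 / e ** 2 for e in errors]
weighted_mean = sum(w_i * r for w_i, r in zip(weights, rates)) / sum(weights)

print(round(mu_males, 3), round(mu_females, 3))
print(round(difference, 3), round(se_difference, 3), round(weighted_mean, 3))
```

The weighted mean is dominated by the November value, whose standard error is much the smallest of the three.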

The death-rate in the population as a whole is, however, not the most satisfactory way 
of investigating the changing mortality experience of a species such as Microtus, which, 
owing to the intermittence of its breeding season, must suffer a series of somewhat violent
and complicated changes in the age constitution of the population during the course of the 
year. Thus, from the figures given earlier, it is impossible to decide whether the increase in 
















the death-rate which occurred in the summer months of 1949 was due to an increase in the
rate at which the overwintering adults were dying off, or whether it was due to a high
mortality among young born during the early part of the breeding season. Indeed, both of
those factors might have been operating, though in different degrees. A more satisfactory
way of approaching the problem would be to construct life-tables for each batch of newly
ringed animals which were released into the population and, if the data permitted, to
subdivide each batch into males and females, and also into young and old. A detailed
subdivision of this nature would, however, require much larger figures than those in the
present series. But, as an example of the methods of estimating these life-tables, we shall
compare the mortality experience of the individuals released for the first time in September,
October and November 1948 with that of the releases in March, April and May 1949,
considering only adult animals in this last month. These animals which were trapped for
the first time in the spring of 1949 must have been alive and of trappable age, certainly in
October 1948, and probably also in September. They were, however, not caught then, and
in view of the phenomenon of non-random sampling during the winter months to which we
have already referred, it was of interest to see whether their expectation of further life,
once they were caught, differed appreciably from that of the survivors of the groups which
entered the traps during the previous autumn.

The various methods of estimating the number of survivors at successive intervals of
time from a known number of individuals released at some given date have already been
described (II, § 2). If G_d individuals form the group released at a date d, then any recaptures
of this group in the samples taken from the general population can be recognized by the
presence of the mark d. These recaptures can be tabulated in various ways, and in the
present series of data we assembled them in the form of the frequencies m_{tx}, the number
recaptured at time t which were last captured at time x (d ≤ x < t). We then obtained
estimates of the number of survivors (N_x) of the original group by the approximate methods
described in II, § 2. There is, however, one further complication in the present data, and
that is the occurrence of accidental deaths and removals from the population at certain
trappings. The simplest method of dealing with these is best illustrated by an actual
example.

In March 1949, 76 Microtus (37 ♂♂, 39 ♀♀) were marked for the first time and released.
Of these, one individual was found dead in a trap in April, and five were trapped in June
and removed from the population. The life histories of these six animals out of the original
76 are therefore known, and we can immediately write down the corresponding life-table
(l_x) figures for them. Thus, six were alive at the time of the trapping in April, five in May
and five in June, and none after that date. The remaining records for this group are
represented by the following table (see p. 154) of m_{tx} values, in which we arbitrarily label the
trappings from April onwards as being taken at t = 0, 1, 2, ....

It will be seen that, for instance, out of the remaining 70 animals released in March 
(having eliminated the six accidental deaths and removals), seven were caught in June, of 
which two were last caught in May and four in April, the remaining individual in the sample 
having escaped capture since March. The last member of the group was trapped in 
November, and hence the last estimate which can be made of the number of survivors is 
that for September. Since no members of the group were caught in April 1950, the number 
of survivors in November is indeterminate; but we do know that at least one individual was 
alive then, and we shall adopt the convention both in this, and in all other such tables, that 
















the survivorship curve ends at the last date on which any member of the original group was 
recaptured. The method of computing the estimates of the number of survivors at times 
x = 0, 1, 2, ..., T−1 from such a table has already been illustrated on a numerical example
































Month last                 Month of capture (t)
captured       x     Apr.    May    June   July   Sept.   Nov.      Σk_t    R_x    ΣR_t    N'_x    l_x    N_x
                      0       1      2      3      4       5
Apr.           0      -       6      4      -      -       -       15.86    30     25     46.3     6     52.3
May            1      -       -      2      1      -       -        5.25    11     14     26.4     5     31.4
June           2      -       -      -      3      -       -        5.25     7      7      9.0     5     14.0
July           3      -       -      -      -      2       1        3.00     4      3      4.0     0      4.0
Sept.          4      -       -      -      -      -       0        0        2      1      4.0     0      4.0
Nov.           5      -       -      -      -      -       -        -        1      -      1       0      1
Total caught and
released (R_t)       30      11      7      4      2       1
z_t                   -    1.4773   1.75   1.75   1.00    1.00

in an earlier paper, the first step being to build up the successive z, and =k‘, values, starting 
at the bottom right-hand corner. (For a definition of these symbols, and an explanation of 
the way in which they arise, reference should be made to II, § 2b.) Then 


we 7. T 
: R,( ¥ R,+ 1) > ligt 1) +1 
r+1 


t=r+1 
Thus, from the table above, we have, as a first step,

$$N'_0 = 30 \times 26/16.86 = 46.3,$$

and since from the survivorship curve of the six accidental deaths and removals, $l_0 = 6$, we
finally have $N_0 = 52.3$ as an estimate of the number of survivors out of the 76 released in
March which were alive at the time of the trapping in April. Having thus obtained the series
of $N_x$, ending with the number trapped at the last date on which any of the group were
recaptured, the construction of the life-table follows in the customary way. Thus, when no
accidental deaths and removals are present, we merely express the $N_x$ in terms of 100 or
1000 individuals alive at the original date of release. When this is not the case, as in the
present example, it is necessary to calculate the series of $P_x$ figures, after allowing for the
accidental deaths and removals which take place at each ordinate $x$. In June, for example,
14.0 individuals were estimated to have been alive at the time of the trapping, five individuals
were removed from the population, and therefore, since 4.0 were estimated to have been
alive in July, the survival factor over the interval June to July is $P_2 = 4.0/(14.0 - 5) = 0.4$.
Then, the values of the life-table function follow from $l_x = P_0 P_1 P_2 \dots P_{x-1}$.
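
The arithmetic of this step can be checked directly. The following is a minimal Python sketch, not the authors' own computing scheme: it takes the $\sum k'_t$, $R_x$, $\sum R_t$ and $l_x$ columns as transcribed from the table above and applies the estimate $R_x(\sum R_t + 1)/(\sum k'_t + 1) + l_x$. The function and variable names are illustrative assumptions.

```python
# Columns transcribed from the table for the group released in March:
# (month, sum of k', R_x, sum of R_t, l_x). Names are illustrative only.
ROWS = [
    ("Apr.", 15.86, 30, 25, 6),
    ("May", 5.25, 11, 14, 5),
    ("June", 5.25, 7, 7, 5),
    ("July", 3.00, 4, 3, 0),
    ("Sept.", 0.0, 2, 1, 0),
]


def survivors(sum_k, r_x, sum_r, l_x):
    """Return (N'_x, N_x): the estimate before and after adding l_x."""
    n_prime = r_x * (sum_r + 1) / (sum_k + 1)
    return n_prime, n_prime + l_x


estimates = {m: survivors(k, r, sr, l) for m, k, r, sr, l in ROWS}

# Survival factor June -> July, after taking out the five individuals
# removed from the population in June.
removed_in_june = 5
p_june_july = estimates["July"][1] / (estimates["June"][1] - removed_in_june)

for month, (n_prime, n_total) in estimates.items():
    print(f"{month:6s} N' = {n_prime:5.1f}   N = {n_total:5.1f}")
print(f"P (June to July) = {p_june_july:.2f}")
```

With these inputs the April figures come out at about 46.3 and 52.3, and the June to July survival factor at a little over 0.4, in agreement with the values quoted in the text.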

In order to combine the results for a number of groups of releases, we may pool the
estimated $\sum k'_t$ and observed $R_x$ and $\sum R_t$ in each table. Thus, for example, from the separate
tables for the original data, which are not presented here, we had for the trapping which
took place in March 1949:



















P. H. LESLIE, DENNIS CHITTY AND HELEN CHITTY 155











Originally
released in      Σk'_t    R_x    ΣR_t    l_x
1948: Sept.       9.25      7     19      1
      Oct.        6.25      8     10      4
      Nov.        5.71      5     13      0
      Total      21.21     20     42      5























From which we have, from the 'total' line,

$$\left[\frac{20 \times 43}{22.21} + 5\right] = 43.7,$$

as an estimate of the total number of these releases surviving in March. Proceeding in this
way, we finally obtained the following life-tables ($l_x$) and expectations of further life, in
weeks, of individuals alive at the given dates ($e_x$):

















                 Individuals marked for the first time and released in
                 Sept., Oct. and Nov.       Mar., Apr. and May       Unmarked animals
Month            l_x       e_x (weeks)      l_x       e_x (weeks)    captured (u_t)
1948: Oct.       1.0000    12.9             -         -              46
      Nov.       0.4793    17.6             -         -              47
1949: Mar.       0.2231    12.0             1.0000    9.9            78
      Apr.       0.2001     8.3             0.6882    8.5            68
      May        0.1582     6.0             0.4781    7.3            37
      June       0.0639     -               0.2081    -              -
      July       0.0372     -               0.1005    -              -
      Sept.      0.0149     -               0.1225    -              -
      Nov.       0.0074     -               0.0184    -              -


























It is evident from the figures available that the expectation of further life of the
overwintering individuals which were caught for the first time in March, April and May differed
very little from that of the survivors of those originally captured in September, October and 
November. This being so, we might assume that their previous mortality experience in the 
unmarked state was probably much the same as in the case of the autumn releases. There is, 
of course, no way of knowing whether or not this was true. If the sampling of marked and 
unmarked Microtus had been satisfactory during the winter months, we should have had 
no hesitation in making this assumption; indeed, it is implicit in all applications of the 
capture-recapture method. But, in the present example, we cannot be certain that a 
difference in the behaviour of these two classes was not associated with a difference in 
mortality experience. If we assume, however, that there was no difference in the mortality 
rates of marked and unmarked animals, we may obtain a rough estimate of the size of the
Microtus population on the area in October 1948 by applying the $l_x$ figures for the autumn
releases to the number of unmarked adult animals which were captured up to May 1949.
These figures are given in the last column of the table, and by dividing each of them by the
appropriate $l_x$ entry and summing, we obtain a total of just over 1000 animals alive in
October. If our $l_x$ figures were exact, which they are not, this would be an underestimate
of the true figure, since a number of overwintering adults were still being captured unmarked
in June, and possibly also in July, though we have been unable to separate these individuals
with any certainty from the young. Moreover, in the population at October 1948, there
were also the survivors of those marked at the trappings previous to this date, which we
estimated in Table 3 to be around 75 individuals. Such estimates as we are indulging in here
can only be approximate, but, so far as they go, they suggest that the total Microtus
population in October 1948 was of the order of 1000 individuals, or a population density of around
250 per acre. Comparing the former figure with the estimates of $N_t$ in Table 3, it will be seen
how greatly we would have underestimated the probable Microtus population on the area
in the autumn of 1948 if we had assumed without any further investigation that the
sampling of this species was satisfactory from the theoretical point of view.
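
The back-projection described in this paragraph is a short calculation. As a hedged illustration (the dictionary names are assumptions, and the figures are those of the autumn-release $l_x$ column and the unmarked-animals column of the life-table above), each unmarked catch is divided by the life-table value for its month and the quotients are summed:

```python
# Life-table values for the autumn releases and numbers of unmarked adults
# captured, October 1948 to May 1949, as given in the table above.
L_X = {
    "1948 Oct.": 1.0000,
    "1948 Nov.": 0.4793,
    "1949 Mar.": 0.2231,
    "1949 Apr.": 0.2001,
    "1949 May": 0.1582,
}
UNMARKED = {
    "1948 Oct.": 46,
    "1948 Nov.": 47,
    "1949 Mar.": 78,
    "1949 Apr.": 68,
    "1949 May": 37,
}

# Each unmarked animal caught at a later month stands for 1 / l_x animals
# alive in October 1948, if marked and unmarked animals died at the same rate.
back_projected = {month: UNMARKED[month] / L_X[month] for month in L_X}
total_october = sum(back_projected.values())

print(f"Estimated animals alive in October 1948: {total_october:.0f}")
```

The sum is a little under 1070, which is the "just over 1000" of the text. The calculation rests entirely on the assumption of equal mortality in marked and unmarked animals, which the authors themselves regard as unverified for this population.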


B. The Clethrionomys population 


The average range covered in its lifetime by an individual Clethrionomys was somewhat 
greater than it was in Microtus, and the trapping area was less well defined by ecological 
barriers. Nevertheless, most Clethrionomys were retrapped within perhaps 30 yd. of their 
original point of capture, and, as is shown below, the population was not recruited except 
at the times one might have expected it to increase by breeding. 

The general biology of this species appears to be very similar to that of Microtus. Thus, 
from the examination in the laboratory of dead Clethrionomys, trapped at the same time 
on other areas at Lake Vyrnwy, there was no evidence that any breeding was taking place 
during the autumn and winter months; whereas pregnant and lactating females were 
common in the samples obtained during the summer. From October to April, therefore, 
we can assume that the population would be decreasing merely through deaths, and that, 
unless immigration were occurring, no new individuals should have been entering the 
population at risk of capture. In one minor respect, however, the data for Clethrionomys 
have been treated differently from those for Microtus. In September 1948 we marked both 
species by means of an ear punch, and while these holes remained perfectly clear in the case 
of Microtus, we have reason to suspect that some might have healed over in the case of 
Clethrionomys. In order to avoid any possibility of error through the loss of these marks 
we have therefore disregarded the September sample for this species. Thus, all those 
animals known to have been first marked in September are regarded as unmarked when 
first recaptured in later months, and the September mark is similarly neglected in all 
other cases. 


(i) Preliminary analysis 

The preliminary analysis of the Clethrionomys data is thus similar to that of Microtus. 
We first of all estimate the successive dilution factors $B_t$ for the entire series of trappings
in order to see whether these approach zero in the winter months. The data and results are
given in Tables 8 and 9, and from the former it will be seen that the individual catches ($C_t$)


of this species are not very large, but appear to follow a fairly regular annual course, being
high in July and October 1948, low in the winter months, and gradually increasing during
the summer of 1949 to a peak in September and November. In April 1950, the number
trapped was much the same as in the preceding April, a complete contrast to the results
for the Microtus population (Table 2).

Table 8. Distribution of Clethrionomys recaptures according to the interval
since they were last captured (m_xt). (♂♂ and ♀♀ combined)

                                             Month of capture (t)
Month when                  1948                        1949                        1950
last captured (x)     June  July  Oct.  Nov.   Mar.  Apr.  May   June  July  Sept. Nov.   Apr.
1948: June             -    17³    1     1      -     -     -     -     -     -     -      -
      July             -     -    11     3      -     -     -     -     -     -     -      -
      Oct.             -     -     -    14      7     6     2     -     -     -     -      -
      Nov.             -     -     -     -     10     -     2     -     1     -     -      -
1949: Mar.             -     -     -     -      -    15     1     1     -     -     -      -
      Apr.             -     -     -     -      -     -    10¹    1     -     -     -      -
      May              -     -     -     -      -     -     -     5     4     -     -      -
      June             -     -     -     -      -     -     -     -     6     -     -      -
      July             -     -     -     -      -     -     -     -     -    10     -      2
      Sept.            -     -     -     -      -     -     -     -     -     -    19      1
      Nov.             -     -     -     -      -     -     -     -     -     -     -     17
Total marked (s_t)     -    17³   12    18     17    21    15¹    7    11    10    19¹    20
Total unmarked (u_t)  34¹   26²   41⁴    4      8     3     6     7    18¹   36¹   39⁵     8
Total catch (C_t)     34    43    53    22     25    24    21    14    29    46    58     28
Total released (R_t)  33    38    49    22     25    24    20    14    28    45    52     28

Turning to the estimates of $B_t$ given in Table 9, it will be seen that for the period October-April,
one estimate is negative, one positive, and one not very far from zero. If we consider
an imaginary population of 100 individuals alive in October 1948, and subject to these
estimated $\lambda_t = (P_t + B_t)$ and $P_t$ values, then, according to the latter, 36.1 out of the original
100 would be alive in April 1949, and from the successive $\lambda_t$ the total population size would

Table 9. Estimates of population parameters for Clethrionomys from data in Table 8

Month           t     N_t      λ_t      P_t      B_t
1948: June      0      -        -      0.620      -
      July      1     51.7    2.458    0.627    1.831
      Oct.      2    114.8    0.368    0.530   -0.162
      Nov.      3     40.8    0.973    0.722    0.251
1949: Mar.      4     39.7    1.000    0.943    0.057
      Apr.      5     39.7    0.811    0.639    0.172
      May       6     32.2    1.195    0.667    0.528
      June      7     37.3    0.777    0.429    0.348
      July      8     29.0    1.643    0.357    1.286
      Sept.     9     46.0    1.911    0.626    1.285
      Nov.     10     86.0     -        -        -

be 35.8; in other words, on the average the change in numbers is accounted for by the
operation of the death-rate. (This result is in marked contrast to that for the Microtus 
population during the same period, in which only 25 out of an original 100 would be alive 
in April, whereas the total population would have increased to 114.) 
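
The two figures for the imaginary population of 100 follow from multiplying the successive factors in Table 9 for the intervals beginning in October, November and March. A small Python sketch of that product is given below; the variable names are illustrative assumptions, and the factors are read from the $P_t$ and $\lambda_t$ columns of Table 9.

```python
import math

# Factors for the three intervals October 1948 -> April 1949 (Table 9, t = 2, 3, 4).
SURVIVAL = [0.530, 0.722, 0.943]   # P_t: survival factors
INCREASE = [0.368, 0.973, 1.000]   # lambda_t = P_t + B_t: overall change in numbers

start = 100
survivors_in_april = start * math.prod(SURVIVAL)
population_in_april = start * math.prod(INCREASE)

print(f"Survivors of the original 100: {survivors_in_april:.1f}")
print(f"Total population size:         {population_in_april:.1f}")
```

The two products agree to within a fraction of an animal, which is the sense in which the change in numbers over the winter is accounted for by the death-rate alone.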

These preliminary results for Clethrionomys are promising from our present point of view, 
and it is unfortunate that no estimates of the dilution factor could be made for the second 
winter during the period November 1949 to April 1950. We may, however, calculate an 
upper limit to $B_t$ for this interval. For this purpose we will define a new quantity,

$$U_t = P_t^{-1} B_t (N_t - d_t).$$
This parameter may have very little real meaning or importance in the case of the general 
population, since it merely represents the number of new entries $B_t(N_t - d_t)$ alive at time $t + 1$
which would have been alive at time $t$ if they had been subject to the estimated survival
factor $P_t$. For instance, if the new entries consisted of births during the interval, this clearly
would be an entirely artificial figure. Mathematically, however, it is the exact equivalent
of the parameter defined as $Z_t$ in the case of the subpopulation of marked animals, and it
may be estimated by a similar type of equation. Thus we have, as shown in the appendix,

$$U_t = (N_t - d_t)\left[\frac{R_t\,C_{t+1}}{m_{t,t+1}\,(N_t - d_t)} - 1\right].$$

Now, no estimate can be made of $P_{T-1}$ for the last interval in a chain of samples, but we
can obtain $N_{T-1}$ and $U_{T-1}$. Thus, from the data given in Tables 8 and 9, we have for
November 1949,

$$N_{T-1} - d_{T-1} = 86.0 - 6 = 80.0 \quad\text{and}\quad U_{T-1} = 5.6.$$

Hence, under the most favourable circumstances, if $P_{T-1} = 1$, we should estimate the
dilution factor between November 1949 and April 1950 as $B_{T-1} = 0.07$, a figure which is to all
intents and purposes zero.
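
The upper limit quoted here follows from inverting the definition $U_t = P_t^{-1} B_t (N_t - d_t)$. The sketch below is only an illustration of that inversion with the November 1949 figures; the function name and argument names are assumptions.

```python
def dilution_factor(u, p, n_minus_d):
    """Invert U = B (N - d) / P to recover the dilution factor B."""
    return p * u / n_minus_d


# November 1949: N - d = 86.0 - 6 and U = 5.6; P cannot be estimated for the
# last interval, so P = 1 gives the largest value B could take.
upper_limit = dilution_factor(u=5.6, p=1.0, n_minus_d=86.0 - 6)
print(f"Upper limit to B, Nov. 1949 to Apr. 1950: {upper_limit:.2f}")
```

Any survival factor below unity would make the estimate smaller still, since $B$ is proportional to $P$ for fixed $U$.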

As a check on this apparent absence of dilution during the winter months, we may now 
consider the recaptures of marked animals during October, November, March, April and
May (adults only), when these are tabulated according to the interval of time since they 
were first marked. Since the number of individuals was in any case small, the data for the 
two sexes were combined, with the results given in Table 10. Calculating the expected 
numbers of marked and unmarked animals as before, we have: 








Month of trapping      Marked         Unmarked       Total
1948: Oct.             12 (12.79)     41 (40.21)      53
      Nov.             18 (15.11)      4 ( 6.89)      22
1949: Mar.             17 (17.53)      8 ( 7.47)      25
      Apr.             21 (19.50)      3 ( 4.50)      24
      May              15 (18.06)      6 ( 2.94)      21




















Comparing the observed numbers with the expected given in brackets, there appears to 
be no very marked trend in the signs of the deviations as there was in the case of the Microtus 
data. Combining the last two classes together, owing to the small expectations in the
unmarked class, $\chi^2 = 2.27$, which for 3 D.F. is a perfectly reasonable value to obtain.
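
The value of $\chi^2$ can be reproduced from the observed and expected numbers tabulated above. The following Python sketch is an illustration only (helper names are assumptions): it pools the April and May classes, as the text describes, and sums $(O - E)^2/E$ over the marked and unmarked cells.

```python
# Observed and expected numbers, October 1948 to May 1949, from the table above.
OBS_MARKED = [12, 18, 17, 21, 15]
EXP_MARKED = [12.79, 15.11, 17.53, 19.50, 18.06]
OBS_UNMARKED = [41, 4, 8, 3, 6]
EXP_UNMARKED = [40.21, 6.89, 7.47, 4.50, 2.94]


def pool_last_two(values):
    """Combine the last two classes (April and May) into one."""
    return list(values[:-2]) + [values[-2] + values[-1]]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over the supplied classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2 = (
    chi_square(pool_last_two(OBS_MARKED), pool_last_two(EXP_MARKED))
    + chi_square(pool_last_two(OBS_UNMARKED), pool_last_two(EXP_UNMARKED))
)
print(f"chi-square = {chi2:.2f}")
```

Most of the total comes from the single unmarked cell for November (4 observed against 6.89 expected), so no systematic trend in the deviations is indicated.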






























































Table 10. Distribution of Clethrionomys recaptures October 1948-May 1949 according
to the month they were first marked and released. (♂♂ and ♀♀ combined)

Month when                  Month of capture                Total
first marked      Oct.   Nov.   Mar.   Apr.   May        recaptures
June-July          12      8      4      6      5            35
Oct.                -     10     12     11      8            41
Nov.                -      -      1      0      0             1
Mar.                -      -      -      4      1             5
Apr.                -      -      -      -      1             1
No. unmarked       41⁴     4      8      3      6
Total catch        53⁴    22     25     24     21

Thus, from this preliminary analysis, we may conclude that the sampling of the
Clethrionomys population was satisfactory, and that there was no evidence of any pseudo-dilution
during the non-breeding season as in the case of Microtus. It is perhaps worth adding that
we also carried out an analysis of these data in which we assumed that no September marks
had been lost, and reached precisely the same conclusions.

(ii) Data for the marked individuals

It is useful to have a further check on the applicability of the methods of estimation to
these particular data by estimating the successive values of $Z_t$ from the re-recaptures of the
marked population. The observed data and resulting estimates are given in Tables 11 and 12,

Table 11. Distribution of Clethrionomys re-recaptures according to the interval since they
were last recaptured (m_xt). (Marked population only; ♂♂ and ♀♀ combined)

                                         Month of capture (t)
Month when               1948                      1949                        1950
last captured (x)     July  Oct.  Nov.   Mar.  Apr.  May   June  July  Sept. Nov.   Apr.
1948: July             -     5     -      -     -     -     -     -     -     -      -
      Oct.             -     -     4      -     2     1     -     -     -     -      -
      Nov.             -     -     -      9     -     2     -     1     -     -      -
1949: Mar.             -     -     -      -    11     1     1     -     -     -      -
      Apr.             -     -     -      -     -     9     1     -     -     -      -
      May              -     -     -      -     -     -     4     3     -     -      -
      June             -     -     -      -     -     -     -     4     -     -      -
      July             -     -     -      -     -     -     -     -     6     -      1
      Sept.            -     -     -      -     -     -     -     -     -     3      -
      Nov.             -     -     -      -     -     -     -     -     -     -      7
s_t                    -     5     4      9    13    13¹    6     8     6     3      8
u_t                   17³    7    14      8     8     2     1     3     4    16¹    12
Total catch (C_t)     17    12    18     17    21    15     7    11    10    19     20
Total released (R_t)  14    12    18     17    21    14     7    11    10    18     20

and it will be seen from the latter that between October 1948 and November 1949 a total
of 151 bank voles were marked for the first time and released, whereas the estimated total
is 150.8. The individual $Z_t$ follow the general trend of the true values very closely,
considering the small size of the samples on which they are based. In regard to the standard
errors of these estimates, the same procedure was followed as in the case of the Microtus
data. The sum of the available $\psi_t$ is 96, and the sum of the corresponding $s_t = 67$. Hence
$f = 0.698$, and the variances of $Z_t$ were then multiplied by the factor 0.302 to give the


Table 12. Estimates, from the data in Table 11, of the number of Clethrionomys which
were marked for the first time and released at each trapping (Z_t)

                           Z_t        True no.                   Standard
Month           ψ_t     (estimated)   released    Deviation    error of Z_t
1948: Oct.      5.00       42.0          37         + 5.0        ±13.3
      Nov.      4.00       16.0           4         +12.0        ± 6.1
1949: Mar.     12.09        9.6           8         + 1.6        ± 4.2
      Apr.     22.33       -1.1           3         - 4.1        ± 4.2
      May      20.00        2.4           6         - 3.6        ± 2.4
      June     13.00        4.1           7         - 2.9        ± 2.4
      July      8.00        7.3          17         - 9.7        ± 2.8
      Sept.     6.00       53.3          35         +18.3        ±18.5
      Nov.      5.57       17.2          34         -16.8        ±14.1
      Total               150.8         151


























individual standard errors in the last column of Table 12. Comparing these with the devia- 
tions of the estimates from the true values, it seems that on the whole the latter are of much 
the order we might expect from errors of sampling. The chief exceptions are the estimates 
for November 1948 and July 1949, for which the deviations are roughly twice and three 
times the respective standard errors. An unknown margin of error in the latter must, 
however, be allowed for, owing to the approximate nature of the correction employed. 
Overall, the sum of the squares of the deviations from the true values is 921, while
$\sum \sigma^2(Z_t) = 810$. Generally speaking, therefore, there is no evidence of any marked
discrepancy in these results, and if this had been a problem of estimating an unknown
parameter $Z_t$, given only the capture-recapture data in Table 11, we should not have been led
into any very serious error either as to the magnitude of this parameter or the changes it 
underwent during the period of observation. 
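
The summary figures in this comparison can be recomputed from Table 12. The sketch below, with illustrative variable names, forms the deviations of the estimates from the true numbers and compares their sum of squares with the sum of the squared standard errors; it also recovers the sampling fraction 67/96 used to correct the variances.

```python
# Columns of Table 12, October 1948 to November 1949.
ESTIMATED = [42.0, 16.0, 9.6, -1.1, 2.4, 4.1, 7.3, 53.3, 17.2]
TRUE_NO = [37, 4, 8, 3, 6, 7, 17, 35, 34]
STD_ERR = [13.3, 6.1, 4.2, 4.2, 2.4, 2.4, 2.8, 18.5, 14.1]

deviations = [est - true for est, true in zip(ESTIMATED, TRUE_NO)]
sum_sq_dev = sum(d ** 2 for d in deviations)
sum_var = sum(se ** 2 for se in STD_ERR)

# Proportion of the marked subpopulation sampled on average, and the
# correction factor applied to the variances.
fraction_sampled = 67 / 96
correction = 1 - fraction_sampled

print(f"Estimated total {sum(ESTIMATED):.1f} against true total {sum(TRUE_NO)}")
print(f"Sum of squared deviations {sum_sq_dev:.0f}, sum of variances {sum_var:.0f}")
print(f"f = {fraction_sampled:.3f}, 1 - f = {correction:.3f}")
```

A ratio of observed to expected squared error of about 921/810 is close to unity, which is what the authors mean by deviations "of much the order we might expect from errors of sampling".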


(iii) Estimation of total numbers 


The results of all the tests which we have been able to apply to these data suggest that the 
sampling of this species was satisfactory from the theoretical point of view. There was no 
evidence of any differential sampling of marked and unmarked animals, and the estimates 
of the numbers marked for the first time and released at each trapping were in good agree- 
ment with the true values. We may therefore apply to these data, with some degree of 


confidence, the methods which have previously been described for estimating the total 
number of individuals in the population. 














Since we are here dealing with a relatively long chain of samples taken over a period of 
very nearly two years, during which time both the total numbers and the death-rate may 
have been varying quite appreciably, the appropriate methods of estimation are in principle 
similar to those which were briefly discussed in II, §4. Thus, we may take each group of 
$G_{xx}$ animals which were caught unmarked and released marked at time $x$ ($x = 0, 1, 2, 3, \dots$)
and estimate the number of these surviving at time $t$ ($t = x+1, x+2, x+3, \dots$) by one or
another of the methods given in II, § 2. Then, if $G_{xt}$ is the estimated number which were
surviving at time $t$, with variance $V(G_{xt})$, the total number of marked individuals in the
population at time $t$ is clearly

$$\psi_t = \sum_{x=0}^{t-1} G_{xt}$$

and

$$V(\psi_t) = \sum_{x=0}^{t-1} V(G_{xt}),$$

since the separate groups of animals are independent. We may then calculate the total
number of individuals in the population by means of the maximum-likelihood estimate

$$N_t = \psi_t C_t / s_t \qquad (t = 1, 2, 3, \dots, T-1);$$

or, preferably perhaps, by the adjusted estimate

$$N_t = \psi_t (C_t + 1)/(s_t + 1),$$


with variance VN) = 


Marmara wr 


In order to carry out satisfactorily all the various steps in this computation, however, it 
would be necessary to have a very much larger body of data than in the present example. 
The number of Clethrionomys which were ringed for the first time and released at each 
sampling was small in a number of cases, as may be seen from the unmarked (u,) figures 
given in Table 8. We therefore obtained a pooled estimate of the number of marked animals 
alive in the population in a similar way to that described for the construction of the life- 
tables for Microtus. 

Each group of releases was considered separately, and having first of all found the
life-table $l_x$ values for any individuals in the group which were accidentally killed, the
re-recaptures of the remaining individuals were tabulated according to the interval of time
since they were last recaptured. The values of $\sum_{t=x+1}^{T} k'_t$, $R_x$ and $\sum_{t=x+1}^{T} R_t$ were then computed from
these entries, as illustrated above in § A (iii), by the methods given in II, § 2b. For a
particular month, all the available values of these three quantities in the separate tables were
then pooled, as well as any $l_x$ figures. Then, an estimate of the total number of marked
animals in the population at time $t$ (written $x$ below, where $t$ is needed as the index of
summation) is given by

$$\psi_x = SR_x \left(S\sum_{t=x+1}^{T} R_t + 1\right) \Big/ \left(S\sum_{t=x+1}^{T} k'_t + 1\right) + Sl_x,$$

the variance of this estimate (II, § 2b) being

$$V(\psi_x) = \frac{(\psi_x - Sl_x)^2 \left(S\sum R_t - S\sum k'_t\right)}{\left(S\sum R_t + 1\right)\left(S\sum k'_t + 2\right)},$$
where in both expressions the symbol $S$ is used to indicate the pooled values. The estimates
of $N_t$ and $V(N_t)$ then follow from the equations given above. The following results were
obtained:








Month          SΣk'_t    SR_x    SΣR_t    Sl_x    ψ_t ± S.E.     s_t    C_t    N_t ± S.E.
1948: July      9.17      14      12       3      20.9 ± 1.5     17     43     51.1 ±  6.9
      Oct.     16.67      12      27       0      19.0 ± 1.6     12     53     78.9 ± 12.9
      Nov.     30.36      18      60       1      36.0 ± 2.6     18     22     43.6 ±  4.1
1949: Mar.     24.51      17      44       1      31.0 ± 2.3     17     25     44.8 ±  4.9
      Apr.     19.10      20      34       1      35.8 ± 3.0     21     24     40.7 ±  3.9
      May      14.00      13      21       2      21.1 ± 1.6     15     21     29.0 ±  3.3
      June      7.60       7      17       2      16.7 ± 2.1      7     14     31.3 ±  6.2
      July      8.00      11       8       0      11.0 ± 0       11     29     27.5 ±  3.6
      Sept.     4.00      10       6       0      14.0 ± 1.8     10     46     59.8 ± 12.0
      Nov.      7.00      18      10       1      25.8 ± 2.6     19     58     76.1 ± 11.4



































One minor point is to be noted in regard to the estimates of $\psi_t$. At the trapping in July
1949, we have $\psi_t = 11.0 \pm 0$; in other words, it seems that the entire marked population at risk
was captured. It is always possible, of course, that in the case of a small population and
a high degree of sampling effort, this may happen, but if so we must have $N_t = C_t$, when we
use the maximum-likelihood estimate of $N_t$ and in using the adjusted estimate we will have
a value of $N_t < C_t$, as will be seen in the last column of the table.

The standard errors of $\psi_t$ and $N_t$ given in the table are adjusted to allow for the proportion
of the population which on the average was being sampled. Thus, by summing the $\psi_t$ and
$s_t$ columns in the table, we have $f = 147/231 = 0.636$, and $1 - f = 0.364$, this factor being
then used to correct the variances. (It is interesting to compare the value of this factor with
that used to correct the variances of the estimates of $Z_t$ given earlier in this section. There,
$1 - f = 0.302$, this figure being based on the estimated $\psi_t$ and observed $s_t$ in Tables 11 and
12. Those $\psi_t$ and $s_t$ are not, however, the same as the ones in the present table, since they
represented the individuals in the subpopulation of marked animals which were marked
at least twice.) It will be seen from the estimates $N_t$ of total numbers that the trappable
Clethrionomys population on the area was not very large, amounting to probably not much 
more than 100 individuals of both sexes at the peak period of numbers during the late 
summer and autumn. From these figures we might say that this was a population which was 
on the average relatively stationary in numbers, and which was merely oscillating during 
the two years owing to the effect of an intermittent breeding season. 
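
The pooled estimates in the table above can be recomputed from its first four columns. The sketch below is a hedged illustration, not the authors' worksheet. It assumes the marked-population estimate has the form $SR(S\sum R + 1)/(S\sum k' + 1) + Sl$, and that its variance has the form $(\psi - Sl)^2 (S\sum R - S\sum k') / [(S\sum R + 1)(S\sum k' + 2)]$; this variance form is an assumption that happens to reproduce the printed standard errors of $\psi_t$ once multiplied by the factor $1 - f$. All function and variable names are illustrative.

```python
import math

# (month, S sum k', S R, S sum R, S l, s_t, C_t), transcribed from the table above.
POOLED = [
    ("1948 July", 9.17, 14, 12, 3, 17, 43),
    ("1948 Oct.", 16.67, 12, 27, 0, 12, 53),
    ("1948 Nov.", 30.36, 18, 60, 1, 18, 22),
    ("1949 Mar.", 24.51, 17, 44, 1, 17, 25),
    ("1949 Apr.", 19.10, 20, 34, 1, 21, 24),
    ("1949 May", 14.00, 13, 21, 2, 15, 21),
    ("1949 June", 7.60, 7, 17, 2, 7, 14),
    ("1949 July", 8.00, 11, 8, 0, 11, 29),
    ("1949 Sept.", 4.00, 10, 6, 0, 10, 46),
    ("1949 Nov.", 7.00, 18, 10, 1, 19, 58),
]
CORRECTION = 1 - 147 / 231  # 1 - f, quoted in the text as 0.364


def marked_estimate(sum_k, r, sum_r, l):
    """Pooled estimate of marked animals at risk and its (assumed) variance."""
    psi = r * (sum_r + 1) / (sum_k + 1) + l
    var = (psi - l) ** 2 * (sum_r - sum_k) / ((sum_r + 1) * (sum_k + 2))
    return psi, var


def adjusted_total(psi, s, c):
    """Adjusted estimate of total numbers, psi (C + 1) / (s + 1)."""
    return psi * (c + 1) / (s + 1)


results = {}
for month, sum_k, r, sum_r, l, s, c in POOLED:
    psi, var = marked_estimate(sum_k, r, sum_r, l)
    results[month] = {
        "psi": psi,
        "se_psi": math.sqrt(var * CORRECTION),
        "N": adjusted_total(psi, s, c),
        "catch": c,
    }

for month, row in results.items():
    print(f"{month:11s} psi = {row['psi']:5.1f} ± {row['se_psi']:.1f}   N = {row['N']:5.1f}")
```

Small differences in the last digit of $N_t$ can arise because the printed table appears to work from rounded values of $\psi_t$. The July 1949 row shows the special case discussed in the text: the estimate of marked animals equals the number caught, its standard error is zero, and the adjusted total falls below the catch of 29.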

In order to complete the description of this population, we also require estimates of the 
death-rate and dilution-rate expressed per unit of time. The successive survival factors 
$P_t$ may be obtained, as before, by means of


P= Vrsald +4), 
and hence the death-rate,

$$\mu_t = -\log_e P_t / w_t,$$

where $w_t$ is the true interval of time between the particular samplings to which $P_t$ refers.
Similarly,

$$\lambda_t = N_{t+1}/(N_t - d_t),$$

and we will define the rate of increase during the interval by

$$\rho_t = \log_e \lambda_t / w_t.$$








This rate of increase $\rho_t$ between $t$ and $t + dt$, which we assume to remain constant over the
interval $t$ to $t + w_t$, is composed of a positive quantity $\beta_t$, representing the rate of dilution,
and $\mu_t$, the death-rate, so that $\beta_t - \mu_t = \rho_t$. Having calculated $\rho_t$ and $\mu_t$, we thus can obtain
$\beta_t$ very simply by subtraction. The sampling errors of these estimates, however, are more
difficult and we have been unable to solve the problems involved. The difficulty arises over
the question as to the degree of correlation which may exist between the successive estimates
of $\psi_t$ and $N_t$ which are obtained from the data treated in this way. Any such correlation
which exists is likely to be negative in sign, and, if we were to neglect this and merely write
down the expressions for the variances of $P_t$ and $\lambda_t$ according to large-scale sample theory,
we run the risk of underestimating the true sampling variances of the resulting $\beta_t$ and $\mu_t$.
Leaving aside this question, however, the following values of $\rho_t$, $\beta_t$ and $\mu_t$, expressed per unit
of 28 days, were obtained for the Clethrionomys population:








Interval            Dilution-rate    Death-rate    Rate of increase
beginning               (β_t)           (μ_t)           (ρ_t)
1948: June-               -             0.305             -
      July-             0.475           0.283           0.192
      Oct.-            -0.067           0.294          -0.361
      Nov.-             0.069           0.062           0.007
1949: Mar.-            -0.009           0.071          -0.080
      Apr.-             0.271           0.609          -0.338
      May-              0.559           0.446           0.113
      June-             0.681           0.798          -0.117
      July-             1.005           0.462           0.543
      Sept.-Nov.        0.320           0.229           0.091




















The first point of interest to be noted is that the three dilution-rates between October and 
April are each essentially zero, a result which might have been expected from the
preliminary analysis, but which nevertheless serves as a useful check on our original approximate
estimates of $B_t$ in Table 9, since we are now utilizing much more of the information contained
in the data. After April the dilution-rate gradually increases during the breeding season 
of 1949. The death-rate in June and July appears to differ quite markedly in the two years, 
being about twice as great in 1949 as in 1948. It is interesting to find that the death-rate 
falls to a relatively low level during the non-breeding season between November 1948 and 
April 1949, in a similar way to that of Microtus. It then increases markedly after April, 
but the dilution-rates are at that time sufficient to keep the population relatively stationary 
in numbers until July. There is then a sharp rise in the rate of increase between July and 
September, which is presumably due to an influx of the young born during the latter part 
of the breeding season, which have grown up to become part of the trappable population. 
The broad pattern of these results thus confirms our previous knowledge of the biology of 
this species, and in addition we have obtained a numerical measure of the changes which 
occurred in the various population parameters during the period of observation. 
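
The three columns of rates in the table above are linked by the relation between the dilution-rate, the death-rate and the rate of increase, so each dilution-rate can be recovered as the sum of the other two. The following Python sketch checks this for every interval; it also shows, as a hypothetical helper, how a factor acting over an interval of $w$ units of 28 days is converted to a rate per unit. Names are illustrative assumptions.

```python
import math


def rate_per_unit(factor, interval):
    """Convert a factor acting over `interval` units into a rate per unit."""
    return math.log(factor) / interval


# (interval beginning, death-rate, rate of increase, printed dilution-rate),
# all per unit of 28 days, from the table above.
RATES = [
    ("1948 July", 0.283, 0.192, 0.475),
    ("1948 Oct.", 0.294, -0.361, -0.067),
    ("1948 Nov.", 0.062, 0.007, 0.069),
    ("1949 Mar.", 0.071, -0.080, -0.009),
    ("1949 Apr.", 0.609, -0.338, 0.271),
    ("1949 May", 0.446, 0.113, 0.559),
    ("1949 June", 0.798, -0.117, 0.681),
    ("1949 July", 0.462, 0.543, 1.005),
    ("1949 Sept.", 0.229, 0.091, 0.320),
]

# Dilution-rate = rate of increase + death-rate.
dilution = {label: rho + mu for label, mu, rho, _ in RATES}

for label, mu, rho, printed in RATES:
    print(f"{label:11s} computed {dilution[label]:+.3f}   printed {printed:+.3f}")

# Example conversion: a survival factor of 0.5 over one unit of 28 days
# corresponds to a death-rate of log_e 2 per unit.
example_death_rate = -rate_per_unit(0.5, 1.0)
```

The computed and printed dilution-rates agree for all nine intervals, including the three winter intervals where the value is essentially zero.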


4. DISCUSSION OF RESULTS 


In the first two papers of the present series we were concerned with the theory of estimating
population parameters from a series of samples, granted certain premises. We could also
show that under conditions when these assumptions were approximately true, as in the
case of a population of counters under our strict control, valid inferences could be drawn 
regarding the number of individuals and the death-rate in the population. In the present 
paper the chief emphasis has been upon testing the truth of the assumptions as they apply 
in nature. Unless such investigations can be undertaken, the parameters of natural popula- 
tions cannot be estimated with any confidence from data obtained by means of the capture- 
recapture method. 

The present field data were collected in a manner which avoided as far as possible the 
grosser errors, such as non-random marking and resampling, or interference with natural 
processes. No trapping technique, however, can sample a population in a manner that is 
independent of the behaviour of the individuals, and biased sampling may be unavoidable 
if there is no homogeneity in the response to the traps. Because this difficulty was en- 
countered, no parameters could be estimated for the Microtus population as a whole. How- 
ever, from several lines of statistical evidence it appeared that once the animals had been 
trapped they behaved thereafter with sufficient uniformity for valid inferences to be drawn 
about the marked population. Thus, although the theoretical methods of estimation 
developed in the previous papers cannot be applied in their entirety to the Microtus popula- 
tion, some important features in the latter may still be described. There is also much addi- 
tional information which we hope to use in a later paper. By contrast, the results for the 
Clethrionomys population provide no evidence by which to reject the necessary assumptions 
of the marking-recapture method of analysis. Accordingly, the death-rates, dilution-rates 
and total numbers may be given with some confidence. Even here, however, one reservation 
must be made. 

Let us first of all review the evidence on which the estimates of the total Microtus popula- 
tion were rejected and those of Clethrionomys were accepted. We knew from the isolated 
nature of its habitat that the Microtus population could not have been recruited by im- 
migration, and from laboratory examinations of other populations in the neighbourhood 
that no young were being added between October 1948 and April 1949. This absence of 
birth and immigration during the winter therefore enabled us to recognize that the apparent 
dilution of a population which was, in fact, decreasing was because the assumptions in the 
mathematical model were not valid for this population. There were no such objections to 
the Clethrionomys data during either of the non-breeding seasons; but similar evidence 
cannot be provided for the rest of the year. Parameters for the breeding seasons are there- 
fore estimated for this species on the assumption that the relevant features of behaviour in 
winter were the same at other times. Whether or not this was the case we have no means of 
knowing. The estimated population changes, however, seem entirely plausible: a rather 
slow increase in the numbers of trappable age during the breeding seasons; a fairly large 
influx at the end, and evidence that these latter animals were at risk of capture throughout 
their post-weaning existence. 

A point of general interest is the following: when a population is subject to birth and 
immigration throughout the period of sampling, is it possible to prove the validity of the 
assumptions necessary to estimate its parameters? Let us suppose, for example, that there 
had been a moderate amount of breeding by Microtus in the first winter: would we then 
have known enough to reject the estimates of population density given in Table 3? The 
problem of verifying the assumptions might not be solved if, in such a case, the field obser- 
vations consisted solely of capture-recapture data, and a more elaborate experimental 
design might be required. It might occasionally be possible to enumerate the entire
population of some species at certain times, so that a few of the estimated values could be 
compared with those observed. Perhaps a more practicable suggestion is to try sampling 
by methods which depend upon other properties than those which determine the com- 
position of the catch by the methods normally employed. These alternative methods might 
be too laborious for regular use and would then be used solely as a check. On the other hand, 
if they proved that the old methods were biased there would be no option but to make a 
complete change in the field techniques. 

Finally, it should be emphasized that the results of this analysis do not imply a universal 
law for each of the two species. The data for some other Clethrionomys population might 
not prove so satisfactory as those presented here, and for the present, at any rate, it will be 
necessary to consider each particular case in detail before deciding that these methods of 
estimation are applicable. 


5. SUMMARY 


1. Two populations of small rodents, Microtus agrestis and Clethrionomys glareolus, 
living on the same area, were sampled by means of a live-trapping technique over a period of 
very nearly two years. The resulting data are analysed here in some detail in order to see 
whether any valid estimates of population parameters can be made by the theoretical 
methods described in two earlier papers. These methods are based on the assumptions that 
the sampling of the population is entirely at random and that all classes of marked and 
unmarked animals are caught with equal facility. The main purpose of the present paper 
is therefore to see whether or not these assumptions were true in the field. 

2. The sampling of the Microtus population as a whole proved unsatisfactory, since it 
appeared that during the winter months marked and unmarked animals had not the same 
chance of being trapped, members of the latter class evidently being less willing to enter the 
traps. Since this difference in behaviour was observed during the non-breeding season, 
when no new members should have been entering the population at risk of capture, we
could not exclude the possibility that it was also occurring at other seasons of the year. In 
consequence, it was not possible to estimate the total numbers of Microtus on the area by 
methods which are based on an assumed random sampling of all classes in the population. 

3. Once trapped, however, the individuals in the Microtus population appeared to behave 
satisfactorily from the point of view of capture-recapture theory. It was therefore possible 
to use the data for these marked animals for the estimation of the death-rate, and for cal- 
culating the further expectation of life of animals marked for the first time and released in 
different months. Such figures, however, which can be estimated by the methods illustrated 
in the text, can only refer, strictly speaking, to that part of the total population which had 
been marked. 

4. The Clethrionomys population, on the other hand, behaved differently. There was no 
evidence of any differential sampling of the various classes of marked and unmarked 
animals, as in the Microtus population. Consequently, we can apply to these data, with 
some degree of confidence, the appropriate methods of estimating the various population 
parameters. 


The expenses of this investigation were partly paid for by a grant from the Agricultural 
Research Council. 













APPENDIX 


The method of obtaining approximate estimates of $N_t$ and $P_t$ from a long chain of samples grouped by
Method B, which was described in II, § 56, was modified in the following way. Instead of using only the
last two entries, $m_{t-2,t}$ and $m_{t-1,t}$, in each column of the original table, we construct a new table by
retaining each value of $m_{t-1,t}$ and forming the sums $\sum_{x=0}^{t-2} m_{x,t}$, which we define as $n_t$ $(t = 2, 3, 4, \ldots, T)$.
Including the unmarked animals, we have therefore at each sampling subsequent to $t = 1$ the three
classes $u_t$, $n_t$ and $m_{t-1,t}$ ($n_t + m_{t-1,t} = s_t$; $s_t + u_t = C_t$). Since $\psi_t$ is the total number of marked individuals
in the population as a whole at time $t$,
$$\psi_0 = 0, \qquad \psi_{t+1} = P_t(\psi_t + y_t), \tag{1}$$
where, allowing for the $d_t$ accidental losses or removals, $y_t = u_t - d_t = R_t - s_t$. Then we have the following
expected number of individuals in the population as a whole, together with the observed numbers in
the three classes, at the time the sample was taken at $t+1$:
$$\begin{array}{ll}
\text{Expected} & \text{Observed} \\
N_{t+1} - \psi_{t+1} & u_{t+1} \\
\psi_{t+1} - P_t R_t & n_{t+1} \\
P_t R_t & m_{t,t+1} \\
N_{t+1} & C_{t+1}
\end{array}$$
It is convenient to introduce a new parameter $W_t$ defined by
$$N_{t+1} = P_t W_t$$
and from (1) to rewrite this table of expected and observed numbers in the form
$$\begin{array}{ll}
\text{Expected} & \text{Observed} \\
P_t[W_t - (\psi_t + y_t)] & u_{t+1} \\
P_t(\psi_t - s_t) & n_{t+1} \\
P_t R_t & m_{t,t+1} \\
P_t W_t & C_{t+1}
\end{array}$$
from which we may write down the log likelihood equation, after cancelling $P_t$,
$$L = u_{t+1}\log[W_t - (\psi_t + y_t)] + n_{t+1}\log(\psi_t - s_t) - C_{t+1}\log W_t. \tag{2}$$


We may now obtain the maximum-likelihood estimates $\hat{W}_t$ and $\hat{\psi}_t$ together with their variances and
covariance, in the usual way. Since, however, we are concerned here not so much with $\psi_t$ and $W_t$, as
with other parameters which are related to them, it is simpler to proceed in another fashion. It will be
seen that we have two degrees of freedom for estimating these two parameters and that the maximum-
likelihood equations are independent. Under these conditions we can equate expected and observed
numbers (Bailey, 1951). Thus, in the present example, we have the pair of equations
$$n_{t+1} = C_{t+1}(\psi_t - s_t)/W_t, \qquad m_{t,t+1} = C_{t+1}R_t/W_t,$$
from which
$$\hat{\psi}_t = \frac{R_t\,n_{t+1}}{m_{t,t+1}} + s_t \quad (t = 1, 2, 3, \ldots, T-1) \tag{3}$$
and
$$\hat{W}_t = R_t C_{t+1}/m_{t,t+1}.$$
This last equation may be rewritten with the help of (3), using
$s_t = R_t - y_t$ and $n_{t+1} + m_{t,t+1} = s_{t+1}$,
in the form
$$\hat{W}_t = (\hat{\psi}_t + y_t)\,C_{t+1}/s_{t+1} \quad (t = 0, 1, 2, \ldots, T-1). \tag{4}$$
Then, from (1),
$$\hat{P}_t = \hat{\psi}_{t+1}/(\hat{\psi}_t + y_t) \quad (t = 0, 1, 2, \ldots, T-2), \tag{5}$$
and since $N_t = P_{t-1}W_{t-1}$, we have, from (4) and (5),
$$\hat{N}_t = \hat{\psi}_t C_t/s_t \quad (t = 1, 2, 3, \ldots, T-1). \tag{6}$$
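As a numerical illustration of equations (3) to (6), the following Python sketch evaluates the estimates for a single pair of successive samples. The function names and the figures in the usage line are hypothetical and are not taken from the field data; `psi` stands for the total number of marked animals in the population, and the second form (4) is used for $W_t$ so that its agreement with $R_t C_{t+1}/m_{t,t+1}$ can be checked.

```python
def chain_estimates(R_t, s_t, C_t, n_next, m_next, C_next):
    """Evaluate equations (3), (4) and (6) for one value of t.

    R_t    : animals returned alive from the sample taken at t
    s_t    : marked animals in the sample taken at t
    C_t    : total catch at t
    m_next : marked animals at t+1 that were among the R_t released at t
    n_next : all other marked animals in the sample at t+1
    C_next : total catch at t+1
    """
    y_t = R_t - s_t                         # net addition to the marked population at t
    s_next = n_next + m_next                # marked animals in the sample at t+1
    psi_t = R_t * n_next / m_next + s_t     # equation (3)
    W_t = (psi_t + y_t) * C_next / s_next   # equation (4)
    N_t = psi_t * C_t / s_t                 # equation (6)
    return psi_t, W_t, N_t


def survival_estimate(psi_next, psi_t, y_t):
    """Equation (5): estimate of P_t from two successive values of psi."""
    return psi_next / (psi_t + y_t)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
psi_t, W_t, N_t = chain_estimates(R_t=50, s_t=30, C_t=55,
                                  n_next=12, m_next=24, C_next=60)
```

With these made-up figures the two expressions for $\hat{W}_t$ coincide, as the algebra leading to (4) requires.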







The variances and covariances of these estimates, appropriate to large-scale sample theory, can be
obtained by means of the type of formula given by Fisher (1950, § 55). Thus, if $\phi_1$ and $\phi_2$ are two functions
of the observations $a_i$, of which the expectations are $m_i$, with $\sum a_i = \sum m_i = n$, the total size of the sample,
$$V(\phi_1) = \sum m\left(\frac{\partial\phi_1}{\partial a}\right)^2 - n\left(\frac{\partial\phi_1}{\partial n}\right)^2, \tag{7}$$
and a similar equation for $V(\phi_2)$, while the covariance is
$$\operatorname{cov}(\phi_1, \phi_2) = \sum m\left(\frac{\partial\phi_1}{\partial a}\right)\left(\frac{\partial\phi_2}{\partial a}\right) - n\left(\frac{\partial\phi_1}{\partial n}\right)\left(\frac{\partial\phi_2}{\partial n}\right).$$
In applying these formulae to the present type of problem, there are, however, two points to be borne in
mind. In the first place we may have a parameter which is a function of the observations from more
than one sample, in which case we must also sum over the samples. And, secondly, it is necessary to
distinguish between observations which enter into the expectations as given values, and the actual
observations themselves. Thus, in equation (3) it is true that $R_t$ and $s_t$ are observed values at time $t$,
but they enter into this expression as part of the expectations at $t+1$, which are conditional on the
results of previous sampling. Here we regard $\hat{\psi}_t$ as a function of the observations at $t+1$, and since $C_{t+1}$
does not occur explicitly, there is no term in $V(\hat{\psi}_t)$ corresponding to $n(\partial\phi_1/\partial n)^2$ in (7). Thus from (7) we
have, in terms of observed values,


$$V(\hat{\psi}_t) = \frac{n_{t+1}R_t^2}{m_{t,t+1}^2} + \frac{n_{t+1}^2R_t^2}{m_{t,t+1}^3} = \frac{R_t^2\,n_{t+1}\,s_{t+1}}{m_{t,t+1}^3}.$$
Formally, we now should substitute expected for observed values, and write
$$V(\hat{\psi}_t) = \frac{W_t(\psi_t + y_t)(\psi_t - s_t)}{R_t C_{t+1}},$$
a result which may be verified by the more lengthy process of forming the maximum-likelihood equations
from (2) for $\psi_t$ and $W_t$, and then inverting the corresponding information matrix. In actual practice,
of course, we would calculate the numerical value of $V(\hat{\psi}_t)$ by inserting the estimated $W_t$ and $\psi_t$ in this
expression, and since $W_t$ is a parameter which was introduced merely for convenience (it is not necessary
here to calculate its actual value), it is more practical to write


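$$V(\hat{\psi}_t) = \frac{(\hat{\psi}_t + y_t)(\hat{\psi}_t - s_t)}{m_{t,t+1}}.$$
In a similar way we have
$$V(\hat{P}_t) = \hat{P}_t^2\left[\frac{V(\hat{\psi}_{t+1})}{\hat{\psi}_{t+1}^2} + \frac{V(\hat{\psi}_t)}{(\hat{\psi}_t + y_t)^2}\right]$$
and
$$V(\hat{N}_t) = \hat{N}_t^2\left[\frac{C_t - s_t}{s_t C_t} + \frac{V(\hat{\psi}_t)}{\hat{\psi}_t^2}\right].$$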


There is one further parameter we wish to determine which arises in the analysis of the subpopulation
of marked animals, namely, $Z_t$, the number of new additions to this population which were made at
time $t$. From the theoretical point of view we therefore consider a population consisting of a variable
number $N_t$ of individuals at the time the sample of $C_t$ is withdrawn, and we make the rule that any
dilution of this population with new members shall only take place at the actual time a sample was
returned. Thus, if an unknown number $Z_t$ of new members enter at time $t$, when the $R_t$ survivors out of
the original sample of $C_t$ are returned, the total number of individuals in the population at this moment
is $N_t + Z_t - d_t$. Then, since by definition no further new members enter during the interval $t$ to $t+1$, the
total number of individuals when the sample of $C_{t+1}$ is taken at $t+1$ is
$$N_{t+1} = P_t(N_t + Z_t - d_t).$$
Now, clearly $(N_t + Z_t - d_t)$ is the same parameter as we have defined earlier as $W_t$, and hence from (4)
and (6)
$$\hat{Z}_t = \frac{(\hat{\psi}_t + y_t)\,C_{t+1}}{s_{t+1}} - \frac{\hat{\psi}_t C_t}{s_t} + d_t, \tag{8}$$
or alternatively
$$\hat{Z}_t = R_t\left[\frac{s_t C_{t+1} - n_{t+1}C_t}{s_t\,m_{t,t+1}} - 1\right] \quad (t = 1, 2, 3, \ldots, T-1).$$
8M 041 













The latter form is more convenient in practice if we merely require the estimates for a set of observed 
frequencies, but if in addition we also require the variances of these estimates, then the former is to be 
preferred. For, if we write $p_t = s_t/C_t$, $q_t = 1 - p_t$, we have, proceeding as before,


ry = (Pn) Lee) ae sa} 


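All these results can be verified by considering the expected numbers in the population and the observed
numbers in the samples taken at $t$ and $t+1$. Thus we have
$$\begin{array}{ll|ll}
\text{Expected at } t & \text{Observed at } t & \text{Expected at } t+1 & \text{Observed at } t+1 \\
N_t - \psi_t & u_t & P_t[W_t - (\psi_t + y_t)] & u_{t+1} \\
\psi_t - P_{t-1}R_{t-1} & n_t & P_t(\psi_t - s_t) & n_{t+1} \\
P_{t-1}R_{t-1} & m_{t-1,t} & P_t R_t & m_{t,t+1} \\
N_t & C_t & P_t W_t & C_{t+1}
\end{array}$$
from which we can easily write down the maximum-likelihood equations for the simultaneous estimation
of the four parameters which can be determined from these samples, namely, $P_{t-1}$, $\psi_t$, $N_t$ and $W_t$, and
obtain the corresponding $4 \times 4$ information matrix. By inverting the latter, the variances and co-
variances are obtained, and finally $\hat{Z}_t$ and $V(\hat{Z}_t)$ from
$$\hat{Z}_t = \hat{W}_t - \hat{N}_t + d_t,$$
$$V(\hat{Z}_t) = V(\hat{W}_t) + V(\hat{N}_t) - 2\operatorname{cov}(\hat{W}_t, \hat{N}_t).$$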


This procedure is, however, very much more lengthy and tedious than the one adopted here.
There remains the question whether this maximum-likelihood estimate of $Z_t$ is biased. It will be noted
from (8) that $\hat{Z}_t$ is essentially the difference between two estimates, each of which is of a type known to be
positively biased (II, § 1). If we could assume that the absolute amount of bias in each case was of much
the same order, then we might regard $\hat{Z}_t$ as being approximately unbiased. But it is doubtful as to
how far this assumption would be true in general, and it is more likely that when $Z_t > 0$ the bias of the
positive term would tend to be greater than that of the negative. Under these circumstances $\hat{Z}_t$ might
have some degree of positive bias, and in place of (8) we might therefore prefer to use the adjusted
estimate,
$$\check{Z}_t = \frac{(\hat{\psi}_t + y_t)(C_{t+1} + 1)}{s_{t+1} + 1} - \frac{\hat{\psi}_t(C_t + 1)}{s_t + 1} + d_t.$$
Since the comparison of $\hat{Z}_t$ with the known true values of this parameter played such an important part
in the argument developed in the text, it was therefore of interest to see whether the use of this adjusted
estimate would have made any difference to our conclusions. The following were the comparative results
obtained in the case of the data for the marked Microtus population:











          True values   $\hat{Z}_t$   $\check{Z}_t$   $\Delta(\hat{Z}_t)$   $\Delta(\check{Z}_t)$
Sept.          65          48.2          44.8            -16.8                 -20.2
Oct.           41          51.4          46.5            +10.4                  +5.5
Nov.           45          38.9          37.5             -6.1                  -7.5
Mar.           76          66.8          65.3             -9.2                 -10.7
Apr.           65         111.8         107.3            +46.8                 +42.3
May            46          19.6          22.3            -26.4                 -23.7
June           48          35.5          35.2            -12.5                 -12.8
July           54          69.5          67.4            +15.5                 +13.4
Sept.          65          78.8          78.4            +13.8                 +13.4
Nov.          108          88.9          81.3            -19.1                 -26.7
Total         613         609.4         586.0































It will be seen that, with one exception, $\check{Z}_t < \hat{Z}_t$ and that we do not obtain quite such a close
correspondence between the estimated and known total number of Microtus released over the whole period.
Taking the individual $\check{Z}_t$, however, it seems that on the average they are, if anything, in slightly better
agreement with the true values than the $\hat{Z}_t$. Thus, the sum of the squares of the deviations $\sum\Delta^2(\check{Z}_t) = 4196$,
whereas $\sum\Delta^2(\hat{Z}_t) = 4351$. These differences are, however, slight. We concluded, therefore, that if the
maximum-likelihood estimate $\hat{Z}_t$ was biased, the degree of this bias was probably small compared with
the sampling errors involved, at any rate in this series of data.
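The totals and the two sums of squared deviations quoted above can be checked directly from the tabulated values. The short Python sketch below does this; the variable names are ours, and the March entry for the adjusted estimate is read as 65.3, the value implied by its tabulated deviation of -10.7 from the true value of 76.

```python
true_values = [65, 41, 45, 76, 65, 46, 48, 54, 65, 108]
z_hat = [48.2, 51.4, 38.9, 66.8, 111.8, 19.6, 35.5, 69.5, 78.8, 88.9]   # maximum-likelihood estimates
z_adj = [44.8, 46.5, 37.5, 65.3, 107.3, 22.3, 35.2, 67.4, 78.4, 81.3]   # adjusted estimates


def sum_sq_dev(estimates, truth):
    """Sum of squared deviations of the estimates from the true values."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth))


total_hat = round(sum(z_hat), 1)
total_adj = round(sum(z_adj), 1)
ss_hat = round(sum_sq_dev(z_hat, true_values))
ss_adj = round(sum_sq_dev(z_adj, true_values))
print(total_hat, total_adj, ss_hat, ss_adj)
```

The adjusted estimate exceeds the maximum-likelihood estimate only in May, which is the one exception mentioned in the text.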


REFERENCES 


Bailey, N. T. J. (1951). Testing the solubility of maximum likelihood equations in the routine
application of scoring methods. Biometrics, 7, 268-74.

Chitty, D. (1952). Mortality among voles (Microtus agrestis) at Lake Vyrnwy, Montgomeryshire
in 1936-9. Phil. Trans. B, 236, 505-52.

Chitty, D. & Kempson, D. A. (1949). Prebaiting small mammals and a new design of live trap. Ecology,
30, 536-42.

Fisher, R. A. (1950). Statistical Methods for Research Workers, 11th ed. Edinburgh: Oliver and Boyd.

Leslie, P. H. (1952). The estimation of population parameters from data obtained by means of the
capture-recapture method. II. The estimation of total numbers. Biometrika, 39, 363-88.

Leslie, P. H. & Chitty, D. (1951). The estimation of population parameters from data obtained by
means of the capture-recapture method. I. The maximum-likelihood equations for estimating
the death-rate. Biometrika, 38, 269-92.













ON THE UTILIZATION OF MARKED SPECIMENS IN ESTIMATING 
POPULATIONS OF FLYING INSECTS 


By C. C. CRAIG 
University of Michigan 


1. INTRODUCTION 


Professor William Hovanitz called my attention to the following problem: An observer 
catches butterflies, marks them, and immediately releases them. It is assumed that a 
butterfly, no matter how many times it has been caught before, has the same susceptibility 
to capture as any other butterfly in the population which is supposed stable while the 
captures are being made. Records are kept of $f_x$, the frequency of cases in which the same
butterfly is caught $x$ times, $x = 1, 2, \ldots$, until a total of $s$ captures of $r$ different butterflies
have been made. ($\sum f_x = r$; $\sum x f_x = s$.) The number, $f_0$, of butterflies which escape is not
observed; the problem is to estimate from the values of $f_x$ the total population $n$ of butterflies
on the area assumed well defined.

The estimation of biological populations by means of capture-recapture data is by no 
means a new problem, though papers dealing with it from the mathematical-statistical 
point of view are largely quite recent. (In particular, see the papers by Leslie & Chitty 
(1951), Bailey (1951), Moran (1951, 1952), and the bibliographies quoted by them.) However, 
the experimental conditions and the mathematical models for the present study appear to 
differ in essential ways from those previously considered. The important point of departure 
is that each butterfly on being netted is immediately marked (with a spot of nail polish) 
and released. The butterflies (Colias eurytheme) were caught in one of two isolated alfalfa
fields, which they inhabit, in southern California. Each catch was made during the same day 
at times when the butterflies were freely flying. Thus, it seemed reasonable to assume that 
the population was stable during a catch. The experimenter, Prof. Hovanitz, endeavoured 
to give each butterfly an equal chance of capture, walking in straight lines across the field 
and deviating in direction before reaching a boundary only when he noticed that a butterfly 
just caught tended to fly down his path. One check of the suitability of a mathematical 
model is to test the agreement of the experimental results with respect to the number of 
butterflies caught once, twice, etc., with those predicted from the model. I will return to 
this point at the end of the paper. 

Two mathematical models seem appropriate to serve as a basis for discussion of this 
estimation problem. It is of some interest to see that both lead to approximately the same 
estimates with little difference in their precision for large samples. It may be of more 
interest that for both models in which the population size is regarded as a parameter, though 
maximum likelihood estimates exist and agree substantially with moment estimates in all 
sixteen of the actual field experiments for which I have data, nevertheless with increasing 
sample size meaningful solutions of the likelihood equation do not exist. 







2. THE APPLICATION OF A TRUNCATED POISSON DISTRIBUTION 
If we suppose that the total period during which captures are made is composed of a large
number of short intervals sufficient for the capture of a butterfly and that each butterfly is
equally subject to being netted in each such interval, it appears that one may consider that
$$E(f_x) = n\,\frac{e^{-\lambda}\lambda^x}{x!} \quad (x = 0, 1, 2, \ldots).$$
If so, the likelihood of the observed sample is
$$L = \frac{e^{-n\lambda}\,\lambda^s\,n!}{(1!)^{f_1}(2!)^{f_2}\cdots(n-r)!\,f_1!\,f_2!\cdots}, \tag{1}$$
in which $n$ and $\lambda$ are parameters.

Before proceeding to maximum-likelihood estimates, we may note that two moment
estimates are immediately available.

Method 1. We have
$$E(r) = n(1 - e^{-\lambda}), \qquad E(s) = n\lambda.$$
Equating the observed $r$ and $s$ to their expected values, we estimate $\lambda$ from
$$\frac{1 - e^{-\lambda}}{\lambda} = \frac{r}{s}. \tag{2}$$
Using this value of $\lambda$, we estimate $n$ from
$$\hat{n} = s/\lambda. \tag{3}$$
In practice we can estimate $n$ directly by eliminating $\lambda$ between (2) and (3), getting
$$\log n - \log(n - r) = s/n,$$
which is readily solved for the integer $n$ which most nearly satisfies it by the use of a good
table of natural logarithms.
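In place of a table of logarithms, the integer solution can be found by a direct search. The Python sketch below is one way of doing so, not a procedure given in the paper; the function name is ours. It relies on the fact that, when $s > r$, the quantity $\log n - \log(n-r) - s/n$ is positive for $n$ just above $r$ and changes sign once as $n$ increases.

```python
import math


def method1_estimate(r, s):
    """Integer n most nearly satisfying log n - log(n - r) = s/n."""
    if s <= r:
        raise ValueError("need s > r, i.e. at least one recapture")

    def f(n):
        return math.log(n) - math.log(n - r) - s / n

    n = r + 1
    while f(n) > 0:          # f falls from positive to negative as n grows
        n += 1
    # The root lies between n - 1 and n; keep whichever integer is closer.
    if n > r + 1 and abs(f(n - 1)) < abs(f(n)):
        return n - 1
    return n


n_example1 = method1_estimate(69, 72)    # r and s of Example 1
n_example2 = method1_estimate(341, 435)  # r and s of Example 2
```

The two calls use the values of $r$ and $s$ from the numerical examples given later in the paper and return the estimates tabulated there for methods 1 and 3.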


In correspondence with Prof. Hovanitz, Prof. Sewall Wright gave the above estimate 
and also suggested the use of the method of maximum likelihood. 


Method 2. It is still simpler to use the first and second power sums
$$s\ (= s_1) \quad \text{and} \quad s_2 = \sum x^2 f_x.$$
For a Poisson distribution,
$$E(s/n) = E(s_2/n) - [E(s/n)]^2,$$
or
$$[E(s)]^2 = n[E(s_2) - E(s)].$$
Thus using $s$ and $s_2$ for their expected values we have the estimate,
$$\hat{n} = s^2/(s_2 - s).$$


Though this is obviously subject to greater sampling error than the first estimate and the 
others to follow, it serves as a quickly obtained rough value which also can be used as a first 
trial value in solving the other estimation equations. 
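The moment estimate of method 2 needs nothing beyond the two power sums. A minimal Python sketch follows; the function name is ours, and the usage lines take $s$ and $s_2$ from the first numerical example and from the summary table given later in the paper.

```python
def method2_estimate(s, s2):
    """Moment estimate n = s^2 / (s2 - s) from the first two power sums."""
    if s2 <= s:
        raise ValueError("need s2 > s, i.e. at least one recapture")
    return s * s / (s2 - s)


n_example1 = method2_estimate(72, 78)     # 5184 / 6
n_example2 = method2_estimate(435, 645)   # 189225 / 210
```

Because the denominator is the small difference $s_2 - s$, a change of one or two recaptures moves the estimate considerably when recaptures are few, which is in keeping with the remark that this estimate has the greater sampling error.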

Method 3. Let us consider the truncated Poisson distribution for which we have observed
frequencies as a complete distribution function, i.e. we write
$$f_x = \frac{e^{-\lambda}\lambda^x}{(1 - e^{-\lambda})\,x!} \quad (x = 1, 2, \ldots).$$
The mean of this distribution is $\lambda/(1 - e^{-\lambda})$, and equating this to $s/r$, we estimate $\lambda$ from
$$r/s = (1 - e^{-\lambda})/\lambda, \tag{4}$$
just as in method 1.
But now the likelihood of the sample is
$$L = \frac{\lambda^s\,r!}{(e^{\lambda} - 1)^r\,(1!)^{f_1}(2!)^{f_2}\cdots f_1!\,f_2!\cdots},$$
and
$$\log L = s\log\lambda - r\log(e^{\lambda} - 1) + \text{terms free of } \lambda.$$
Then
$$0 = \frac{d\log L}{d\lambda} = \frac{s}{\lambda} - \frac{r}{1 - e^{-\lambda}}$$
is equation (4) again. Thus in this case the estimates obtained by the moments principle
and by maximum likelihood agree. We next estimate $n$ from the relation
$$E(r) = n(1 - e^{-\lambda}),$$
as in method 1. That is, methods 1 and 3 are mechanically the same, but under the argument
for method 3 the estimate for $\lambda$ is that given by maximum likelihood.
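Equation (4) has no closed-form solution for $\lambda$, but its left-hand ratio $(1 - e^{-\lambda})/\lambda$ falls steadily from 1 towards 0 as $\lambda$ increases, so a bisection search is sufficient. The sketch below is an illustration under that observation; the function name, the tolerance and the starting bracket are our own choices.

```python
import math


def truncated_poisson_lambda(r, s, tol=1e-12):
    """Solve (1 - exp(-lam)) / lam = r / s for lam > 0 by bisection."""
    if s <= r:
        raise ValueError("need s > r, i.e. at least one recapture")
    target = r / s

    def h(lam):
        # -expm1(-lam) is 1 - exp(-lam), computed accurately for small lam.
        return -math.expm1(-lam) / lam - target

    lo, hi = 1e-12, 1.0
    while h(hi) > 0:             # widen the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam = truncated_poisson_lambda(69, 72)   # r and s of Example 1
n_hat = 69 / -math.expm1(-lam)           # from E(r) = n(1 - e^{-lam})
```

The value of `n_hat` obtained this way is not an integer; it lies close to the integer estimate that methods 1 and 3 give for the same data.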


Method 4. We return to the likelihood function (1) in which both $n$ and $\lambda$ are regarded as
parameters. Now
$$\log L = -n\lambda + s\log\lambda + \log[n^{(r)}] + \text{terms free of } \lambda \text{ and } n,$$
where $n^{(r)} = n(n-1)\cdots(n-r+1)$. Then
$$\frac{\partial\log L}{\partial\lambda} = \frac{s}{\lambda} - n \quad \text{and} \quad \frac{\partial\log L}{\partial n} = -\lambda + \sum_{k=0}^{r-1}\frac{1}{n-k}.$$
Equating these to zero and eliminating $\lambda$, the estimation equation for $n$ is
$$\frac{s}{n} = \sum_{k=0}^{r-1}\frac{1}{n-k}.$$
Now the only solutions of this equation of interest are those for which $n > r$. It is not
difficult to show that such solutions exist if
$$s/r < 1 + \tfrac{1}{2} + \ldots + 1/r.$$
But for any fixed population, $r \le n$, while $s$, the total number of butterflies caught, may be
increased without limit (ignoring the wear and tear on the butterflies). Thus for increasing
sample size the likelihood equation fails to have a meaningful solution, and the usual
theorems on the asymptotic behaviour of likelihood estimates are not available. Of course
as $s \to \infty$, $r$ has $n$ for its stochastic limit. For all the sixteen field experiments for which I have
data, solutions $n > r$ do exist and can be found using tables of sums of reciprocals for $n$'s
up to 450, taking the estimate given by method 2 as a first trial value. For larger $n$'s a good approximation to
$\sum_{k=0}^{r-1} 1/(n-k)$ is given by
$$\log n - \log(n-r+1) + \tfrac{1}{2}[1/(n-r+1) + 1/n],$$
and the estimation relation on neglecting terms of $O(n-r+1)^{-2}$ is
$$\log n - \log(n-r+1) + \tfrac{1}{2}[1/(n-r+1) + 1/n] = s/n.$$
As shown in the table at the end, these estimates agree quite well with those given by the
other methods.
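Both the exact sum of reciprocals and its logarithmic approximation are easily evaluated by machine. The Python sketch below solves the estimation equation of method 4 either way by an integer search; the function names are ours, and the existence condition $s/r < 1 + \tfrac{1}{2} + \ldots + 1/r$ is checked before the search begins.

```python
import math


def reciprocal_sum(n, r):
    """Exact value of sum_{k=0}^{r-1} 1/(n - k)."""
    return sum(1.0 / (n - k) for k in range(r))


def reciprocal_sum_approx(n, r):
    """Logarithmic approximation to the same sum, as quoted in the text."""
    a = n - r + 1
    return math.log(n) - math.log(a) + 0.5 * (1.0 / a + 1.0 / n)


def method4_estimate(r, s, total=reciprocal_sum):
    """Integer n > r most nearly satisfying total(n, r) = s/n."""
    harmonic_r = sum(1.0 / k for k in range(1, r + 1))
    if not (r < s and s / r < harmonic_r):
        raise ValueError("no solution with n > r")

    def f(n):
        return total(n, r) - s / n

    n = r + 1
    while f(n) > 0:          # f is positive near n = r and changes sign once
        n += 1
    if n > r + 1 and abs(f(n - 1)) < abs(f(n)):
        return n - 1
    return n


exact_1 = method4_estimate(69, 72)
approx_1 = method4_estimate(69, 72, total=reciprocal_sum_approx)
exact_2 = method4_estimate(341, 435)
```

For the two numerical examples of the paper the exact sum and the approximation lead to the same integer.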







3. THE APPLICATION OF STEVENS’S DISTRIBUTION FUNCTION FOR GROUPS 


A second mathematical model may be arrived at by deriving an expression for the expected 
number of times a butterfly will be caught if a total of s captures is made, if it is assumed 
that each butterfly, no matter how many times it has been netted before, is equally liable 
to capture and by considering after each capture the probability that the next will be of 
a butterfly that has been caught before. But this was seen to be equivalent to a problem 
solved by W. L. Stevens (1937) as follows: Let each of a set of objects on being drawn be
equally likely to be assigned to any one of $n$ classes. After $s$ objects have been drawn and
so assigned to classes at random, what is the probability that exactly $r$ classes will be
occupied? Stevens's solution (which I also obtained before consulting his paper) is
$$u_{r,s} = \frac{n^{(r)}\,\mathfrak{S}_s^r}{n^s}, \tag{5}$$
in which $\mathfrak{S}_s^r$ is a Stirling's number of the second kind (see Jordan, 1947, pp. 171-2). (Actually
Stevens's problem was not original with him either; see Jordan, 1947, pp. 177-9.)

With this distribution law n is the only parameter to be estimated. Again this can be 
found from a sample with a given r and s either by moments or by maximum likelihood. 


Method 5. Stevens found that
$$E(r) = n[1 - (1 - 1/n)^s],$$
and thus from a single sample we have the estimation relation
$$1 - r/n = (1 - 1/n)^s.$$
This can be solved by trial and error by means of a good table of logarithms fairly
conveniently by rewriting it
$$\log(n-r) = s\log(n-1) - (s-1)\log n. \tag{6}$$
Method 6. The likelihood of the sample is given by (5). We have
$$\log L = \log u_{r,s} = \log n^{(r)} - s\log n + \text{terms free of } n,$$
and
$$\frac{d\log L}{dn} = \sum_{k=0}^{r-1}\frac{1}{n-k} - \frac{s}{n}. \tag{7}$$
This equated to zero gives the same estimation relation as in method 4 though arrived at
from a different model. That this is to be expected can be verified by observing that if in (1)
$\lambda$ is replaced by $s/n$, then to maximize (1) is to maximize $n^{(r)}/n^s$.

Moreover, it is readily seen that numerically methods 5 and 6 are almost equivalent. In
practice we take in method 6 the value of $n$ for which the right member of (7) is most
nearly zero. Suppose that instead we seek the largest value of $n$ for which
$$\frac{(n-1)^{(r)}}{(n-1)^s} < \frac{n^{(r)}}{n^s}.$$
But this inequality reduces to
$$1 - \frac{r}{n} < \left(1 - \frac{1}{n}\right)^s.$$
However, it is obvious that in this case, too, there are the same difficulties in regard to the
large sample behaviour of this estimator as in method 4.
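The inequality just obtained gives a convenient stopping rule: the likelihood under Stevens's distribution increases with $n$ so long as $1 - r/n < (1 - 1/n)^s$, and the estimate is the last integer for which this holds. The Python sketch below applies that rule; the function name is ours, and the search assumes $s > r$, without which the inequality never fails.

```python
def method6_estimate(r, s):
    """Largest integer n for which 1 - r/n < (1 - 1/n)**s."""
    if s <= r:
        raise ValueError("need s > r, i.e. at least one recapture")
    n = r
    # Step upwards while the likelihood is still increasing at n + 1.
    while 1.0 - r / (n + 1) < (1.0 - 1.0 / (n + 1)) ** s:
        n += 1
    return n


n_example1 = method6_estimate(69, 72)
n_example2 = method6_estimate(341, 435)
```

For both numerical examples of the paper the rule returns the estimate tabulated there for methods 5 and 6.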













Let us now derive the large sample variances of the other estimates which are three in
number, since methods 1 and 3 are equivalent.
For method 1 we proceed as follows: With $\lambda$ replaced by its estimate $s/n$, the estimation
relation is
$$r = n(1 - e^{-s/n}),$$
and from this
$$\frac{\delta n}{n} = \frac{\delta r - e^{-\lambda}\,\delta s}{n(1 - e^{-\lambda} - \lambda e^{-\lambda})}. \tag{8}$$
Of course $\sigma_{\hat{n}}$, if $\hat{n}$ is the estimate, will be proportional to $n$, and we will find $\sigma^2_{\hat{n}/n}$, which is
given to $O(1/n)$ by finding the expected value of the square of the right member of (8). We
have then to $O(1/n)$
$$\sigma^2_{\hat{n}/n} = \frac{e^{-\lambda}(1 - e^{-\lambda}) - 2e^{-\lambda}\lambda e^{-\lambda} + \lambda e^{-2\lambda}}{n(1 - e^{-\lambda} - \lambda e^{-\lambda})^2} = \frac{1}{n(e^{\lambda} - 1 - \lambda)}. \tag{9}$$
For method 2 proceeding in the same way, starting with $n = s^2/(s_2 - s)$, we have
$$\frac{\delta n}{n} = \frac{2(s_2 - s)s\,\delta s - s^2(\delta s_2 - \delta s)}{n(s_2 - s)^2}.$$
Taking the expected value of the square of the right member and the values of moments
about zero of a variable obeying the Poisson distribution law, we get the simple result to
$O(1/n)$ in this case,
$$\sigma^2_{\hat{n}/n} = \frac{2}{n\lambda^2}. \tag{10}$$
As is to be expected, this variance is larger than for method 1.
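Formulae (9) and (10) are simple to evaluate once an estimate of $n$ is in hand, with $\lambda$ taken as $s/\hat{n}$. The following Python sketch does so; the function names are ours, and the usage lines take the estimates 840 and 864 obtained for the first numerical example of the paper.

```python
import math


def sigma_method1(n_hat, s):
    """Square root of (9): proportional standard error for methods 1 and 3."""
    lam = s / n_hat
    # expm1(lam) - lam is e^lam - 1 - lam, computed accurately for small lam.
    return math.sqrt(1.0 / (n_hat * (math.expm1(lam) - lam)))


def sigma_method2(n_hat, s):
    """Square root of (10): proportional standard error for method 2."""
    lam = s / n_hat
    return math.sqrt(2.0 / (n_hat * lam * lam))


se1 = sigma_method1(840, 72)
se2 = sigma_method2(864, 72)
```

Since $e^{\lambda} - 1 - \lambda$ is nearly $\lambda^2/2$ when $\lambda$ is small, the two standard errors differ little when recaptures are few, as in this example.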
For method 5 the result is a bit more cumbersome. We first get
$$\frac{\delta n}{n} = \frac{(n-1)\,\delta r}{\bar{r}(s+n-1) - ns},$$
in which on the right $\bar{r}$ satisfies the relation
$$\bar{r} = n\left[1 - \left(1 - \frac{1}{n}\right)^s\right].$$
Next we must find $\sigma_r^2$. We have
$$E(r) = n\left[1 - \left(1 - \frac{1}{n}\right)^s\right]$$
and it is easy to find
$$E[r(r-1)] = \frac{n-1}{n^{s-1}}\left[n^s - 2(n-1)^s + (n-2)^s\right].$$
Using these
$$\sigma_r^2 = n\left(1 - \frac{1}{n}\right)^s + n(n-1)\left(1 - \frac{2}{n}\right)^s - n^2\left(1 - \frac{1}{n}\right)^{2s}.$$
This quantity can be so small for even moderately large $n$'s even though the last two terms
are large that it appears necessary to calculate it just as it stands. We have to order $O(1/n)$
$$\sigma_{\hat{n}/n} = \frac{(n-1)\,\sigma_r}{\bar{r}(s+n-1) - ns}.$$
I will illustrate the use of these various formulae for estimates of $n$ and their variances
with a pair of numerical examples from data of Prof. Hovanitz.



















Example 1.

    x      f_x
    0       -
    1      66
    2       3
         69 = r          s = 72,  s_2 = 78

By method 1 (or 3) on solving $\log n - \log(n - 69) = 72/n$ we get $\hat{n} = 840$. Our estimate
of $\lambda$ is then 72/840, from which
$$\sigma^2_{\hat{n}/n} = \frac{1}{840\,(e^{72/840} - 1 - 72/840)}$$
and $\sigma_{\hat{n}/n} = 0.561$.
By method 2,
$$\hat{n} = \frac{72^2}{78 - 72} = 864, \qquad \sigma_{\hat{n}/n} = \left(\frac{2}{n\lambda^2}\right)^{1/2} = \left(\frac{2n}{s^2}\right)^{1/2} = 0.577.$$
By method 4, on solving
$$\log n - \log(n - 68) + \frac{1}{2}\left(\frac{1}{n - 68} + \frac{1}{n}\right) = \frac{72}{n},$$
we get $\hat{n} = 828$. By method 5 (or 6) we also get $\hat{n} = 828$. For method 5 we have
$\sigma_{\hat{n}/n} = 0.55$.

All of these values of the standard deviation of the proportional error of the estimate of
$n$ are high. This illustrates the fact that for an $n$ of this magnitude a catch of only about
8% of the population was not an adequate sample for the estimation of the total.


Example 2. Let us see how matters turn out in another example in which the estimated
population is about the same but the total catch is some six times greater.

        Data                 Estimates and standard errors
    x      f_x           Method     $\hat{n}$    $\sigma_{\hat{n}/n}$
    0       -            1 and 3      856        0.0871
    1     258            2            901        0.0976
    2      72            4            853          -
    3      11            5            853        0.0867
                         6            853          -
        341 = r

    s = 435,  s_2 = 645


The following table is a summary of the results for sixteen experimental catches. 

Finally, I return to the question mentioned earlier, of testing the suitability of the models
used to the experimental conditions. For the first model using the estimates of $n$ and
$\lambda$ obtained, it is easy to calculate the predicted frequencies for $x = 0, 1, 2, \ldots$. For the two
examples given in detail using the estimates found by methods 1 or 3, the frequencies
calculated for $x = 0, 1, 2, \ldots$ are: Example 1: 771.0, 66.1, 2.8, 0.1; and Example 2: 515.1,
261.7, 66.5, 11.2, 1.4, 0.1. A $\chi^2$ test is superfluous for the first; for the second the significance
level is above 0.5. I have similarly tested the remaining fourteen examples for which I have
data. The significance level for $\chi^2$ was above 0.5 in every case but one, in which it was
approximately 0.3. The second model does not lend itself to a similar check. For that I only
have to offer the fact that the estimates obtained by its use agree quite well with those found
from the first model.

    Observed values      Estimates of n by method        Values of $\sigma_{\hat{n}/n}$ for method
    r     s    s_2     1 and 3     2     4, 5 or 6       1 and 3      2        5
    69    72    78       840      864       828           0.56      0.58     0.55
    93   108   140       352      364       348           0.23      0.25     0.23
   159   187   249       560      564       557           0.17      0.18     0.17
   144   161   197       708      720       703           0.22      0.24     0.22
    63    76   108       196      180       193           0.24      0.25     0.24
    56    66    92       195      168       192           0.28      0.28     0.28
    48    74   138        79       86        78           0.14      0.18     0.14
   341   435   645       856      901       853           0.087     0.098    0.087
   276   330   450       895      908       892           0.11      0.13     0.12
   222   249   303      1063     1148      1059           0.18      0.19     0.18
   154   225   385       277      316       275           0.090     0.112    0.090
   148   180   256       444      426       442           0.15      0.16     0.15
    63    91   155       116      129       114           0.15      0.18     0.14
    60    98   216        91       81        90           0.11      0.13     0.11
    71   118   258       105       99       104           0.10      0.12     0.098
    46    89   211        59       65        58           0.092     0.13     0.088

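The predicted frequencies quoted for the first model follow from $E(f_x) = n e^{-\lambda}\lambda^x/x!$ with $\lambda = s/\hat{n}$. The Python sketch below reproduces the four values given for Example 1; the function name is ours. Figures computed in this way for Example 2 may differ slightly in the last digit from the printed ones, presumably through rounding of the estimate.

```python
import math


def predicted_frequencies(n_hat, s, x_max):
    """Poisson frequencies n * exp(-lam) * lam**x / x! for x = 0, ..., x_max."""
    lam = s / n_hat
    return [n_hat * math.exp(-lam) * lam ** x / math.factorial(x)
            for x in range(x_max + 1)]


example1 = [round(f, 1) for f in predicted_frequencies(840, 72, 3)]
print(example1)
```

The first entry is the predicted number of butterflies never caught, which is why it is so much larger than the observed frequencies for $x \ge 1$.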
REFERENCES 
Bailey, N. T. J. (1951). On estimating the size of mobile populations from recapture data. Biometrika,
38, 292-306.

Jordan, Charles (1947). Calculus of Finite Differences, 2nd ed. New York.

Leslie, P. H. & Chitty, Dennis (1951). The estimation of population parameters from data obtained
by means of the capture-recapture method. I. The maximum likelihood equations for estimating
the death rate. Biometrika, 38, 269-92.

Moran, P. A. P. (1951). A mathematical theory of animal trapping. Biometrika, 38, 307-11.

Moran, P. A. P. (1952). The estimation of death rates from capture-mark-recapture sampling.
Biometrika, 39, 181-8.

Stevens, W. L. (1937). The significance of grouping. Ann. Eugen., Lond., 8, 57-69.







THE TOTAL SIZE OF A GENERAL STOCHASTIC EPIDEMIC 


By NORMAN T. J. BAILEY 
Nuffield Lodge, Regent’s Park, London 


1. INTRODUCTION 


The early work on the mathematical theory of epidemics (e.g. Ross,* 1916 and later; 
Brownlee, 1918; Kermack & McKendrick, 1927 and later; Soper, 1929) was invariably of 
a ‘deterministic’ nature, and assumed that for given numbers of susceptible and infectious 
individuals, and given attack and removal rates, a certain definite number of fresh cases 
would occur in any specified time. However, it is widely realized that an appreciable 
element of chance enters into the conditions under which new infections or removals take 
place. The probability approach was fundamental to Greenwood’s (1931, 1946) use of chain 
binomials in discussing the distribution of multiple cases of disease in households. Recent 
discussions of these problems (e.g. Bartlett, 1946, 1949; Bailey, 1950) have therefore turned 
to ‘stochastic’ models. In these we have, for any given instant of time, probability dis- 
tributions for the total numbers of susceptible and infected individuals replacing the single 
point-values of the deterministic treatments. Stochastic models have a special importance 
in this context due to the fact that for epidemic processes stochastic means are not the same 
as the corresponding deterministic values. Although for large homogeneously mixing groups 
deterministic methods might be fairly adequate, it seems likely that in practice epidemics 
actually occur in several relatively small groups of friends and acquaintances, the epidemio- 
logical returns for an administrative unit being compounded of many such comparatively 
distinct processes. Moreover, when we are considering the distribution of cases of a disease 
in a household the size of the group is always so small as to demand a stochastic treatment. 

In my 1950 paper I discussed a simple stochastic epidemic where none of the infected 
individuals was removed from circulation by death, recovery or isolation. This might well 
apply to some of the milder infections of the upper respiratory tract, and can also be used 
approximately to represent epidemics for which the time taken for removal from circulation 
is long compared with the time usually required for the epidemic to be completed. The 
present paper considers the more general problem of allowing for both infection and 
removal. The analytical difficulties present in the treatment of the simple epidemic appear 
here in a more acute form, though it has proved possible to compute the frequency dis- 
tribution of the total size of the epidemic for moderate group size given the ratio of removal 
to infection rate. The results obtained may be compared with those described by Kermack 
& McKendrick (1927) in the deterministic case. No obvious analogue to the threshold 
theorem can be discerned in the stochastic models. An important application of these 
results is to the problem of the distribution of multiple cases of disease in a household, and 


a method is given for obtaining maximum-likelihood estimates of the ratio of the removal 
to infection rate. 


* Although Ross started with the idea of probability his mathematical theory is essentially 
deterministic. 




2. DETERMINISTIC TREATMENT 


We must first glance briefly at the results obtained in the deterministic case. The following 
treatment, with constant infection and removal rates, is substantially that given by Kermack 
& McKendrick (1927), though with some slight alterations to their notation. 

Consider a homogeneously mixing community of n individuals, of whom at time t there
are x susceptibles, y infectious cases in circulation and z individuals who are isolated, dead,
or recovered and immune. Thus we have

\[ x + y + z = n. \]

Now suppose that there is a constant infection rate $\beta$ and a constant removal rate $\gamma$, so that
the number of new infections in time dt is $\beta xy\,dt$ and the number of removals from circulation
is $\gamma y\,dt$. We can choose our time scale so that t is replaced by $\beta t$. Then it is easy to see that
the course of the epidemic is represented by the differential equations

\[
\left.
\begin{aligned}
\frac{dx}{dt} &= -xy,\\
\frac{dy}{dt} &= xy - \rho y,\\
\frac{dz}{dt} &= \rho y,
\end{aligned}
\right\} \tag{1}
\]

where $\rho = \gamma/\beta$, the ratio of the removal to infection rate. Initially, when t = 0, we can
assume that x is approximately equal to n. It is then clear from (1) that unless $\rho < n$ no
epidemic can start to build up, as this requires $[dy/dt]_{t=0} > 0$. Kermack & McKendrick
obtained an approximate solution to (1) for epidemics of small magnitude and showed that,
if $\rho = n - \nu$, where $\nu$ is small compared with n, an epidemic of total size $2\nu$ will occur. This
constitutes what may be called Kermack & McKendrick’s Threshold Theorem, and can
be interpreted by saying that if the initial density of susceptibles is $n = \rho + \nu$ then the
introduction of a few infected persons will give rise to an epidemic, after which the density
of susceptibles is reduced to $\rho - \nu$, a value as far below the threshold $\rho$ as originally it was
above it. Somewhat similar results can be obtained for the more general case of variable
infection and removal rates, and an extension can be made to the situation where an
intermediate host is involved (Kermack & McKendrick, 1927).
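
As a rough numerical check on the threshold statement, note that equations (1) give
dx/dz = -x/ρ, so that x = x(0) exp(-z/ρ). If the initial number of infectives is taken to be
negligible (an assumption made here only for simplicity), the total size w then satisfies
w = n{1 - exp(-w/ρ)}. The Python sketch below solves this relation by bisection and compares
the root with 2ν. The function name, the bracketing interval and the iteration count are
illustrative choices and are not part of the original treatment.

```python
import math

def deterministic_total_size(n, rho, iters=200):
    """Positive root w of w = n * (1 - exp(-w / rho)).

    Assumes a negligible initial number of infectives, so x(0) is about n.
    Returns 0.0 when rho >= n, since no epidemic can then build up.
    """
    if rho >= n:
        return 0.0

    def f(w):
        # f(w) = w - n * (1 - exp(-w / rho)); expm1 keeps precision for small w
        return w + n * math.expm1(-w / rho)

    lo, hi = 1e-6, float(n)  # f(lo) < 0 because n / rho > 1, and f(hi) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    n, nu = 1000.0, 50.0
    w = deterministic_total_size(n, n - nu)
    print(f"total size = {w:.1f}, 2*nu = {2 * nu:.1f}")
```

With n = 1000 and ν = 50 the root lies within a few per cent of 2ν, as the approximate
theorem suggests; the agreement worsens as ν becomes comparable with n.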


3. STOCHASTIC TREATMENT 


Let us now consider the stochastic analogue of the deterministic treatment discussed in the
previous section. We shall use the same definitions of x, y and z, and shall replace t by $\beta t$ as
before. Then on the assumption of homogeneous mixing of the susceptibles and infectious
individuals in circulation the probability of one new infection taking place in time dt is
$xy\,dt$, while the probability of one infected person being removed from circulation in time
dt is $\rho y\,dt$. Let $p_{rs}(t)$ be the probability that at time t there are r susceptibles still uninfected
and s infectious individuals in circulation. Let us assume that the epidemic is started by the
introduction of a infectious cases into a population of n susceptibles. It is now easy to show
by the usual methods that the whole process can be characterized by the partial differential
equation for the probability generating function $\Pi$:


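\[
\frac{\partial \Pi}{\partial t} = (v^2 - uv)\frac{\partial^2 \Pi}{\partial u\,\partial v}
  + \rho(1 - v)\frac{\partial \Pi}{\partial v}, \tag{2}
\]
where
\[ \Pi = \sum_{r,s} u^r v^s p_{rs}, \tag{3} \]
with limits
\[ 0 \le r+s \le n+a, \qquad 0 \le r \le n, \qquad 0 \le s \le n+a. \tag{4} \]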


Equation (2) is substantially that given by Bartlett (1949, equation (49)), putting his 
immigration rate equal to zero. 


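Let us now use the Laplace transform and its inverse with respect to time given by

\[
\left.
\begin{aligned}
\phi^*(\lambda) &= \int_0^\infty e^{-\lambda t}\,\phi(t)\,dt, \qquad R(\lambda) > 0,\\
\phi(t) &= \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} e^{\lambda t}\,\phi^*(\lambda)\,d\lambda,
\end{aligned}
\right\} \tag{5}
\]

where $\int_{c-i\infty}^{c+i\infty} = \lim_{w\to\infty}\int_{c-iw}^{c+iw}$, and c is positive and greater than the abscissae of all the
residues. Taking transforms of (2) and (3), and using the boundary condition

\[ p_{na}(0) = 1, \tag{6} \]
we obtain
\[
(v^2 - uv)\frac{\partial^2 \Pi^*}{\partial u\,\partial v} + \rho(1 - v)\frac{\partial \Pi^*}{\partial v}
  - \lambda\Pi^* + u^n v^a = 0, \tag{7}
\]
and
\[ \Pi^* = \sum_{r,s} u^r v^s p^*_{rs} = \sum_{r,s} u^r v^s q_{rs}, \tag{8} \]
where
\[ q_{rs} = p^*_{rs} = \int_0^\infty e^{-\lambda t}\,p_{rs}(t)\,dt. \tag{9} \]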


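Substituting (8) in (7), and equating coefficients of $u^r v^s$, yields the recurrence relations

\[
\left.
\begin{aligned}
(r+1)(s-1)\,q_{r+1,s-1} - \{s(r+\rho)+\lambda\}\,q_{rs} + \rho(s+1)\,q_{r,s+1} &= 0\\
\text{and}\qquad -\{a(n+\rho)+\lambda\}\,q_{na} + 1 &= 0,
\end{aligned}
\right\} \tag{10}
\]
with $0 \le r+s \le n+a$, $0 \le r \le n$, $0 \le s \le n+a$.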
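
The coefficient matching behind (10) can be set out term by term. Writing $\Pi^*$ as the double
series in (8), each term of (7) contributes to the coefficient of $u^r v^s$ as follows; this is only
a worked restatement of the step described above, and introduces nothing new.

\[
\begin{aligned}
v^2\,\frac{\partial^2\Pi^*}{\partial u\,\partial v} &:\quad (r+1)(s-1)\,q_{r+1,s-1},\\
-uv\,\frac{\partial^2\Pi^*}{\partial u\,\partial v} &:\quad -rs\,q_{rs},\\
\rho\,\frac{\partial\Pi^*}{\partial v} &:\quad \rho(s+1)\,q_{r,s+1},\\
-\rho v\,\frac{\partial\Pi^*}{\partial v} &:\quad -\rho s\,q_{rs},\\
-\lambda\Pi^* &:\quad -\lambda\,q_{rs},\\
u^n v^a &:\quad 1 \ \text{if } (r,s) = (n,a), \ \text{and } 0 \text{ otherwise}.
\end{aligned}
\]

Adding these contributions gives the first relation of (10). At (r, s) = (n, a) the terms in
$q_{n+1,a-1}$ and $q_{n,a+1}$ vanish, since their suffixes lie outside the permitted ranges, and
the second relation is left.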


Any $q_{rs}$ whose suffix falls outside the prescribed ranges is taken to be identically zero. It is
evident from the form of the equations that, starting with $q_{na}$, all the quantities $q_{rs}$ could
be calculated successively. Using the inverse of the Laplace transformation, we could then
arrive at the required $p_{rs}$, exhibiting them as sums of exponential terms like $e^{-i(j+\rho)t}$. There
seems to be considerable difficulty in handling such expressions in a compact and convenient
way to give, for example, epidemic completion times or the stochastic epidemic curve
showing the rate of change with respect to time of the average total number of removals at
any instant. However, some progress is possible if we concentrate attention on the total
size of the epidemic, i.e. the value of $n + a - x$ for $t = \infty$. As $t \to \infty$ all terms in the expansion
of $p_{rs}(t)$ involving negative exponentials like $e^{-i(j+\rho)t}$ vanish unless i = 0. The non-vanishing
term is the coefficient of $\lambda^{-1}$ in the partial fraction expansion of $q_{rs}$ in terms of $\{i(j+\rho)+\lambda\}^{-1}$.















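Now the epidemic ceases to spread to fresh susceptibles as soon as s = 0. Thus the probability
of an epidemic of total size w (not counting the initial a infectious persons) is

\[
\begin{aligned}
P_w &= \lim_{t\to\infty} p_{n-w,0}(t) \qquad (0 \le w \le n),\\
    &= \lim_{\lambda\to 0} \lambda\,q_{n-w,0}\\
    &= \lim_{\lambda\to 0} \rho\,q_{n-w,1}, \quad \text{putting } r = n-w \text{ and } s = 0 \text{ in (10)},\\
    &= \rho f_{n-w,1},
\end{aligned} \tag{11}
\]
where
\[ f_{rs} = \lim_{\lambda\to 0} q_{rs}, \tag{12} \]
for $1 \le r+s \le n+a$, $0 \le r \le n$, $1 \le s \le n+a$.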


The quantities $f_{rs}$ evidently satisfy the following recurrence relations obtained from (10) by
writing $f_{rs}$ for $q_{rs}$ and putting $\lambda = 0$,

\[
\left.
\begin{aligned}
(r+1)(s-1)\,f_{r+1,s-1} - s(r+\rho)\,f_{rs} + \rho(s+1)\,f_{r,s+1} &= 0\\
\text{and}\qquad -a(n+\rho)\,f_{na} + 1 &= 0,
\end{aligned}
\right\} \tag{13}
\]
with same limits as in (12).
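Some further simplification results from writing
\[
f_{rs} = \frac{n!\,(r+\rho-1)!\,\rho^{\,n+a-r-s}}{s\,r!\,(n+\rho)!}\;g_{rs}. \tag{14}
\]
Substituting in (13) gives
\[
\left.
\begin{aligned}
g_{r+1,s-1} - g_{rs} + (r+\rho)^{-1}\,g_{r,s+1} &= 0\\
\text{and}\qquad g_{na} &= 1.
\end{aligned}
\right\} \tag{15}
\]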
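
The scheme defined by (11), (14) and (15) lends itself to direct computation. The Python
sketch below fills in the quantities g from (15), working downwards in r and in s, and then
converts g at s = 1 into the probabilities of each total size by way of (14) and (11). The
factorials of non-integral arguments are evaluated through the log-gamma function, which is
an implementation choice made here, as are the function name and the array layout. The
sketch requires a positive removal ratio.

```python
import math

def total_size_probs(n, a, rho):
    """Return [P_0, ..., P_n] from relations (11), (14) and (15); needs rho > 0."""
    if rho <= 0:
        raise ValueError("rho must be positive for this scheme")
    top = n + a  # largest admissible value of r + s
    # g[r][s], zero padded so that out-of-range suffixes read as 0
    g = [[0.0] * (top + 2) for _ in range(n + 2)]
    for r in range(n, -1, -1):
        for s in range(top - r, 0, -1):
            if r == n and s == a:
                g[r][s] = 1.0  # boundary condition g_{na} = 1
            else:
                # (15): g_{rs} = g_{r+1,s-1} + g_{r,s+1} / (r + rho)
                g[r][s] = g[r + 1][s - 1] + g[r][s + 1] / (r + rho)
    probs = []
    for w in range(n + 1):
        r = n - w
        # (11) with (14) at s = 1:
        # P_w = n! Gamma(r + rho) rho^(a + w) g_{r,1} / (r! Gamma(n + rho + 1))
        log_c = (math.lgamma(n + 1) + math.lgamma(r + rho)
                 + (a + w) * math.log(rho)
                 - math.lgamma(r + 1) - math.lgamma(n + rho + 1))
        probs.append(math.exp(log_c) * g[r][1])
    return probs

if __name__ == "__main__":
    for w, p in enumerate(total_size_probs(5, 1, 2.0)):
        print(w, round(p, 4))
```

A convenient check is that the returned probabilities sum to unity for any admissible
n, a and ρ.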


I am indebted to Dr F. G. Foster for suggesting to me the alternative approach of considering
the succession of population states represented by the points (r, s). Thus the progress of the
epidemic can be regarded as a random walk from the point (n, a) to the points $(n-w, 0)$,
w = 0, 1, ..., n, with an absorbing barrier at s = 0, and where the possible transitions from
(r, s) are
$(r,s) \to (r-1,s+1)$, occurring with probability $r/(r+\rho)$,
and $(r,s) \to (r,s-1)$, occurring with probability $\rho/(r+\rho)$.
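
The random-walk description can be turned into a short calculation by carrying probability
forward over the states (r, s) rather than enumerating paths. The Python sketch below does
this, accumulating the probability absorbed at each point (n - w, 0). It is one possible
arrangement of the calculation, not the procedure used for the published figures, and the
function names are illustrative.

```python
def random_walk_total_size(n, a, rho):
    """Return [P_0, ..., P_n] by propagating probability over the states (r, s)."""
    # reach[r][s]: probability that the walk passes through the state (r, s)
    reach = [[0.0] * (n + a + 1) for _ in range(n + 1)]
    reach[n][a] = 1.0
    probs = [0.0] * (n + 1)
    for r in range(n, -1, -1):             # r never increases
        for s in range(n + a - r, 0, -1):  # at fixed r, s falls only by removals
            p = reach[r][s]
            if p == 0.0:
                continue
            if r > 0:
                reach[r - 1][s + 1] += p * r / (r + rho)  # new infection
                p_remove = rho / (r + rho)
            else:
                p_remove = 1.0             # no susceptibles left: only removals
            if s == 1:
                probs[n - r] += p * p_remove  # absorbed at (r, 0)
            else:
                reach[r][s - 1] += p * p_remove
    return probs

def mean_total_size(n, a, rho):
    """Average total size, not counting the initial cases."""
    return sum(w * p for w, p in enumerate(random_walk_total_size(n, a, rho)))

if __name__ == "__main__":
    print([round(p, 4) for p in random_walk_total_size(3, 1, 1.0)])
    print(round(mean_total_size(10, 1, 2.5), 2))
```

Because each state is visited once, the work grows only as the number of states, which is of
order n(n + a).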


Foster’s general formula for $P_w$ can now be written down almost immediately simply by
considering the sum of the probabilities of all possible paths from (n, a) to $(n-w, 0)$. Thus
we have
\[
P_w = \frac{n!}{(n-w)!}\,\rho^{a+w} \sum_{\alpha}
  (\rho+n)^{-\alpha_0-1}(\rho+n-1)^{-\alpha_1-1}\cdots(\rho+n-w)^{-\alpha_w-1}, \tag{16}
\]








where the summation is over all compositions of $a+w-1$ into $w+1$ parts such that
$0 \le \alpha_i \le a+i-1$ for $0 \le i \le w-1$ and $1 \le \alpha_w \le a+w-1$. However, for the purposes of
computation there appears to be some advantage, especially if n is at all large, in calculating the
quantities $P_w$ from (11), (14) and (15), instead of from (16). The reason for this is that not
only is the form of (16) not very suitable for computation, but also the partitional nature of
the summation may leave some doubt as to whether all relevant terms have been included
in any specific instance. Using (11), (14) and (15), therefore, the $P_w$ have been calculated
over a suitable range of values of $\rho$, for n = 10, 20 and 40, and taking a = 1 as a standard
initial condition. Some typical results are shown in Figs. 1, 2 and 3. It can be seen from
Figs. 1, 2 and 3 that when the relative removal rate $\rho$ is large epidemics tend to be small, and
conversely. There is a fairly gradual transition between the two extremes, though for some
intermediate values of $\rho$ most of the probability is accounted for by the two ends of the
distribution. For example, with $\rho = 5$ for n = 20, there is a 20 % chance of no additional
cases and a 64 % chance of 19 or 20. Again, there is only a gradual drop in the average size
of the epidemic with increasing $\rho$. Specimen values for the range $\rho = 0\,(0.25n)\,1.50n$ are set
out in Table 1 below. There is no obvious analogue of the Threshold Theorem derived by
Kermack & McKendrick (1927) for the deterministic case.

Fig. 1. Diagram showing the probability of the final total size of the epidemic for groups of
ten susceptibles, starting with the introduction of one new infectious case. (Axes: final total
size of epidemic, w; probability, $P_w$.)

Fig. 2. Diagram showing the probability of the final total size of the epidemic for groups of
twenty susceptibles, starting with the introduction of one new infectious case. (Axes: final
total size of epidemic, w; probability, $P_w$.)

Fig. 3. Diagram showing the probability of the final total size of the epidemic for groups of
forty susceptibles, starting with the introduction of one new infectious case. (Axes: final total
size of epidemic, w; probability, $P_w$.)


Table 1. Average total size of epidemic for various values of $\rho$ and n
(not counting initial case)

   $\rho$      n = 10    n = 20    n = 40
   0          10.00     20.00     40.00
   0.25n       7.13     14.40     29.12
   0.50n       4.33      7.97     15.31
   0.75n       2.74      4.32      6.94
   1.00n       1.89      2.62      3.58
   1.25n       1.38      1.77      2.18
   1.50n       1.08      1.30      1.50




















4. HOUSEHOLD DISTRIBUTION OF CASES 


The methods of the foregoing section may also be employed to investigate the distribution 
of multiple cases of a disease in households. This problem was first examined statistically 
by Greenwood (1931) who considered the hypothesis that, with a fairly infectious disease 
like measles, the first case in a family would arise from an outside contact while subsequent 
cases would occur through contacts within the family. The period of infectiousness is thought 
to be short for measles and if reduced, for the purpose of simplification, to an instant, 
Greenwood showed that the course of the intra-familial epidemic may be represented by 
a chain of binomial distributions. The frequencies of the final number of cases observed can 
then be found in terms of a parameter p, which is a measure of infectiousness. Such a model 
is quite adequate for measles, and satisfactory tests of goodness-of-fit were obtained for 
the data available, merely by equating observed and expected means. However, for 
diseases like diphtheria which have a more extended period of infectiousness there is probably 
some advantage in using the concepts of infection and removal employed in this paper. 
Equations (11), (14) and (15) can be used as before to calculate the quantities $P_w$ for small
values of n, e.g. 1 to 5, keeping in $\rho$ as a parameter and taking a to be unity. There is some
advantage, for simplicity in handling the algebra, in partially solving the recurrence relation
in (15) to give $g_{rs}$ as a linear function of $g_{r+1,i}$, $i = (s-1), \ldots, (n-r)$. The requisite formulae
are easily found to be
\[
\left.
\begin{aligned}
g_{rs} &= \sum_{i=s-1}^{n-r} (r+\rho)^{s-i-1}\,g_{r+1,i} \qquad (s > 1),\\
\text{with}\qquad g_{r1} &= g_{r2}/(r+\rho),\\
\text{and}\qquad g_{n1} &= 1.
\end{aligned}
\right\} \tag{17}
\]





























































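Table 2

n = 1:
\[
P_0 = \frac{\rho}{\rho+1}, \qquad P_1 = \frac{1}{\rho+1}, \qquad
\hat{\rho} = \frac{a_0}{a_1}, \qquad I_\rho = \frac{N}{\rho(\rho+1)^2}.
\]

n = 2:
\[
\begin{aligned}
P_0 &= \frac{\rho}{\rho+2}, \qquad
P_1 = \frac{2\rho^2}{(\rho+2)(\rho+1)^2}, \qquad
P_2 = \frac{2(2\rho+1)}{(\rho+2)(\rho+1)^2},\\
\frac{dL}{d\rho} &= \frac{a_0+2a_1}{\rho} + \frac{2a_2}{2\rho+1}
  - \frac{2(a_1+a_2)}{\rho+1} - \frac{N}{\rho+2}.
\end{aligned}
\]

n = 3:
\[
\begin{aligned}
P_0 &= \frac{\rho}{\rho+3}, \qquad
P_1 = \frac{3\rho^2}{(\rho+3)(\rho+2)^2}, \qquad
P_2 = \frac{6\rho^3(2\rho+3)}{(\rho+3)(\rho+2)^2(\rho+1)^3},\\
P_3 &= \frac{6(5\rho^3+12\rho^2+8\rho+2)}{(\rho+3)(\rho+2)^2(\rho+1)^3},\\
\frac{dL}{d\rho} &= \frac{a_0+2a_1+3a_2}{\rho} + \frac{2a_2}{2\rho+3}
  + \frac{(15\rho^2+24\rho+8)\,a_3}{5\rho^3+12\rho^2+8\rho+2}\\
  &\quad - \frac{3(a_2+a_3)}{\rho+1} - \frac{2(a_1+a_2+a_3)}{\rho+2} - \frac{N}{\rho+3}.
\end{aligned}
\]

n = 4:
\[
\begin{aligned}
P_0 &= \frac{\rho}{\rho+4}, \qquad
P_1 = \frac{4\rho^2}{(\rho+4)(\rho+3)^2}, \qquad
P_2 = \frac{12\rho^3(2\rho+5)}{(\rho+4)(\rho+3)^2(\rho+2)^3},\\
P_3 &= \frac{24\rho^4(5\rho^3+27\rho^2+47\rho+27)}{(\rho+4)(\rho+3)^2(\rho+2)^3(\rho+1)^4},\\
P_4 &= \frac{24(14\rho^6+93\rho^5+235\rho^4+293\rho^3+197\rho^2+74\rho+12)}
            {(\rho+4)(\rho+3)^2(\rho+2)^3(\rho+1)^4},\\
\frac{dL}{d\rho} &= \frac{a_0+2a_1+3a_2+4a_3}{\rho} + \frac{2a_2}{2\rho+5}
  + \frac{(15\rho^2+54\rho+47)\,a_3}{5\rho^3+27\rho^2+47\rho+27}\\
  &\quad + \frac{(84\rho^5+465\rho^4+940\rho^3+879\rho^2+394\rho+74)\,a_4}
               {14\rho^6+93\rho^5+235\rho^4+293\rho^3+197\rho^2+74\rho+12}\\
  &\quad - \frac{4(a_3+a_4)}{\rho+1} - \frac{3(a_2+a_3+a_4)}{\rho+2}
  - \frac{2(a_1+a_2+a_3+a_4)}{\rho+3} - \frac{N}{\rho+4}.
\end{aligned}
\]

n = 5:
\[
\begin{aligned}
P_0 &= \frac{\rho}{\rho+5}, \qquad
P_1 = \frac{5\rho^2}{(\rho+5)(\rho+4)^2}, \qquad
P_2 = \frac{20\rho^3(2\rho+7)}{(\rho+5)(\rho+4)^2(\rho+3)^3},\\
P_3 &= \frac{60\rho^4(5\rho^3+42\rho^2+116\rho+106)}{(\rho+5)(\rho+4)^2(\rho+3)^3(\rho+2)^4},\\
P_4 &= \frac{120\rho^5(14\rho^6+177\rho^5+910\rho^4+2443\rho^3+3626\rho^2+2836\rho+918)}
            {(\rho+5)(\rho+4)^2(\rho+3)^3(\rho+2)^4(\rho+1)^5},\\
P_5 &= \frac{120\,A(\rho)}{(\rho+5)(\rho+4)^2(\rho+3)^3(\rho+2)^4(\rho+1)^5},\\
A(\rho) &= 42\rho^{10}+596\rho^9+3604\rho^8+12240\rho^7+25941\rho^6+36144\rho^5\\
  &\quad +34061\rho^4+21952\rho^3+9456\rho^2+2448\rho+288,\\
\frac{dL}{d\rho} &= \frac{a_0+2a_1+3a_2+4a_3+5a_4}{\rho} + \frac{2a_2}{2\rho+7}
  + \frac{(15\rho^2+84\rho+116)\,a_3}{5\rho^3+42\rho^2+116\rho+106}\\
  &\quad + \frac{(84\rho^5+885\rho^4+3640\rho^3+7329\rho^2+7252\rho+2836)\,a_4}
               {14\rho^6+177\rho^5+910\rho^4+2443\rho^3+3626\rho^2+2836\rho+918}
  + \frac{A'(\rho)\,a_5}{A(\rho)}\\
  &\quad - \frac{5(a_4+a_5)}{\rho+1} - \frac{4(a_3+a_4+a_5)}{\rho+2}
  - \frac{3(a_2+a_3+a_4+a_5)}{\rho+3}\\
  &\quad - \frac{2(a_1+a_2+a_3+a_4+a_5)}{\rho+4} - \frac{N}{\rho+5},\\
A'(\rho) &= 420\rho^9+5364\rho^8+28832\rho^7+85680\rho^6+155646\rho^5+180720\rho^4\\
  &\quad +136244\rho^3+65856\rho^2+18912\rho+2448.
\end{aligned}
\]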



















On the other hand, with values of n as small as 1 to 5, it is probably just as easy to derive
the $P_w$ straight from Foster’s formula (16).

We cannot expect to be able to estimate $\beta$ and $\gamma$ separately as the asymptotic distribution
of epidemic size for infinite time yields no information about the time scale. For this we
should require data giving the time intervals between successive infections in families with
two or more cases.

Having calculated the $P_w$ for any given family size, we can throw the results into a form
suitable for the maximum likelihood estimation of $\rho$. N is the total number of families of
a given size; and $a_w$ is the observed number of families with a total of w cases in addition to
the first one. The case n = 0 is trivial. For n = 1 there are simple expressions for the amount
of information as well as the maximum likelihood estimate, while for $n \ge 2$ the information
functions become increasingly awkward to handle, and the simplest procedure is to use the
well-known method of calculating the observed amount of information from sufficiently
close values of the score. The values of the $P_w$ and the corresponding score for n = 1, 2, 3, 4
and 5 are set out in Table 2.

The values of $P_w$ can be conveniently checked by ensuring that their sum for any given
n is unity. Although the scores for the larger n contain some awkward looking polynomials
in $\rho$, there is little difficulty in practice, with the aid of Barlow’s Tables and a calculating
machine, in computing the score at a few trial values of $\rho$ for the purposes of inverse
interpolation. However, if such methods were to be used at all extensively it would be worth
while considering the construction of special tables. The scores are linear functions of the
observations and the coefficients of these observational quantities could be tabulated over
a wide range of values of $\rho$.
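
The procedure described above, of evaluating the score at trial values and locating its zero,
can be sketched in a few lines of Python for the case n = 2, using the score given in Table 2.
Bisection stands in here for inverse interpolation, and the observed information is taken
from two close values of the score. The family counts used in the demonstration are
hypothetical and serve only to exercise the code; the function names and the step h are
likewise illustrative.

```python
import math

def score_n2(rho, a0, a1, a2):
    """Score dL/d(rho) for families with n = 2 susceptibles (Table 2)."""
    N = a0 + a1 + a2
    return ((a0 + 2 * a1) / rho + 2 * a2 / (2 * rho + 1)
            - 2 * (a1 + a2) / (rho + 1) - N / (rho + 2))

def solve_score(score, lo, hi, iters=200):
    """Bisection for a root of score(rho) on [lo, hi]; assumes a sign change."""
    f_lo = score(lo)
    if f_lo * score(hi) > 0:
        raise ValueError("score does not change sign on the interval")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = score(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def observed_information(score, rho, h=1e-4):
    """Minus the slope of the score, from two close values of the score."""
    return -(score(rho + h) - score(rho - h)) / (2 * h)

if __name__ == "__main__":
    counts = (30, 10, 20)  # hypothetical a0, a1, a2, for illustration only
    s = lambda rho: score_n2(rho, *counts)
    rho_hat = solve_score(s, 0.01, 100.0)
    info = observed_information(s, rho_hat)
    print(rho_hat, 1.0 / math.sqrt(info))
```

For n = 1 the same root-finder should reproduce the explicit estimate $a_0/a_1$, which gives
a simple check on the arithmetic.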

Suitable data for the application of the above methods do not seem to be available, or at 
any rate are not readily accessible, apart from the material on the 1926 St Pancras measles 
epidemic used by Greenwood (1931). As already mentioned, Greenwood found that the 
chain-binomial model gave a satisfactory fit and so we should hardly expect the present 
model to give a very adequate description for measles, though it might for other diseases. 
This is in fact the case. I have carried out the maximum likelihood estimation of $\rho$ on
Greenwood’s data for families up to total size 5, and in no case is a satisfactory fit obtained.
It is therefore hardly worth giving the details of the calculations. The epidemiological 
implications are, however, important, for it has thus now been shown that in the case of 
measles a satisfactory fit is to be obtained neither by the simple binomial distribution to be 
expected if the disease were not highly infectious within families (Greenwood, 1931) nor 
by the distribution expected when there is both infection and removal of the types discussed 
in the present paper. On the other hand, the chain binomial model used by Greenwood 
(1931), appropriate to very short periods of high infectivity, is adequate. 


5. SUMMARY AND CONCLUSIONS 


An investigation has been made of the total size, i.e. for infinite time, of a general
stochastic epidemic involving both infection and removal by recovery, death or isolation.
For small homogeneously mixing groups no analogue has been found of the Threshold
Theorem derived by Kermack & McKendrick (1927) for the deterministic case. In stochastic
models wide variations in the size of an epidemic can occur purely by chance with fixed
infection and removal rates. This may have important consequences for the interpretation
of epidemiological data. An application to the problem of the distribution of multiple cases
of disease in a household is also considered. It is shown how maximum likelihood estimates
of the ratio of removal to infection rate can be obtained from suitable data, and the
appropriate maximum likelihood scores are given for families up to a total size of 5 (not
including the first case). The model under discussion is not suitable for diseases like measles
involving short periods of high infectivity, but its adequacy for other infections requires
to be tested.


Norman T. J. BAILEY


I am indebted to Miss Eva Rowland for undertaking the computations, on which Figs. 1, 
2 and 3 and Table 1 were based. 


REFERENCES 


BAILEY, NORMAN T. J. (1950). A simple stochastic epidemic. Biometrika, 37, 193.

BARTLETT, M. S. (1946). Stochastic Processes (notes of a course given at the University of North
Carolina, 1946).

BARTLETT, M. S. (1949). Some evolutionary stochastic processes. J.R. Statist. Soc. B, 11, 211.

BROWNLEE, J. (1918). Certain aspects of the theory of epidemiology in special relation to plague.
Proc. Roy. Soc. Med. (Sect. Epid. and State Med.), p. 85.

GREENWOOD, M. (1931). On the statistical measure of infectiousness. J. Hyg., Camb., 31, 336.

GREENWOOD, M. (1946). The statistical study of infectious diseases. J.R. Statist. Soc. 109, 87.

KERMACK, W. O. & McKENDRICK, A. G. (1927 and later). Contributions to the mathematical theory
of epidemics. Proc. Roy. Soc. A, 115, 700; 138, 55; 141, 94.

ROSS, R. (1916 and later). An application of the theory of probabilities to the study of a priori
pathometry. Proc. Roy. Soc. A, 92, 204; 93, 212; 93, 225.

SOPER, H. E. (1929). Interpretation of periodicity in disease-prevalence. J.R. Statist. Soc. 92, 34.











[ 186 ] 


EXPERIMENTAL EVIDENCE CONCERNING CONTAGIOUS 
DISTRIBUTIONS IN ECOLOGY 


By D. A. EVANS 
King’s College, Newcastle upon Tyne 


INTRODUCTION 


In order to get evidence about the applicability of theoretical contagious distributions, as 
much data as possible on plant and insect populations have been collected together. The 
goodness of fit has been considered for three contagious distributions. They are the negative 
binomial, the Pólya-Aeppli, and the Neyman Type A. The general conclusions are that, on
the whole, plant quadrat counts are fairly well fitted by the Neyman Type A distribution,
while insect counts are fitted by the negative binomial distribution and not by the other
two distributions. It is suggested that this difference is possibly due to a greater degree
of competition and overcrowding in the case of plants. In Part I experimental evidence
is given and discussed. Mathematical formulae and charts required are given in Part II.


Part I. EXPERIMENTAL EVIDENCE


1.1. Description of data


The plant quadrat counts and insect population counts analysed in this paper are arranged 
in four tables. Plant counts made at eight British localities are given in Table 1, while 
Table 2 refers to counts made on three types of American prairie. Some counts of insects 
and larvae are given in Table 3, and two counts of moth eggs are given in Table 4. The 
various sources from which these data have been collected are as follows: 

(i) Counts numbered 1-24, Table 1A, were made by Archibald (1948) and consist of
samples from five maritime and two grassland communities. The type of community to
which a count belongs is indicated in Table 1A by a letter at the head of the count. The key
to this letter code and the names of the species concerned are given in Table 1B. The
populations were sampled by means of a 500 sq.cm. quadrat split up into twenty-five
smaller quadrats, each of 20 sq.cm. Frequency distributions were then made up from 100 or
500 contiguous 20 sq.cm. quadrats. Except where otherwise stated, plant species with less
than an average of 0.6 of an individual per quadrat have been excluded from the present
discussion. Counts with low means give poor discrimination between the three theoretical
distributions.

(ii) The remaining six counts in Table 1A, numbered 25-30, have been given by Barnes
& Stanbury (1951). The observations were made upon uniform level expanses of virgin
mica dam, several feet thick and some 5000 sq.yd. in area, the waste product of clay pits.
Various sizes of quadrat were used, and the individual dams from which the six sets of
counts were taken were classified by Barnes according to their moisture content, height and
stage of colonization. A summary of this information is given in Table 1C.

Barnes’s data deal with the initial colonization of virgin area by immigration and the 
subsequent development of colonies by reproduction. With the exception of the willow, 







































































Table 1A. Frequencies for various plant species
Number of 
individuals la 2a 3a 4a 5a 6a Ta 8b 9b 10b 
0 4 12 15 39 57 82 60 26 — 45 
1 3 8 17 23 6 4 18 36 —- 25 
2 8 9g 28 17 12 4 19 27 3 ll 
3 13 13 18 13 5 3 2 10 2 4 
4 ll 6 9 4 5 1 ae -- 4 8 
5 9 8 12 4 5 2 es _ 8 3 
6 8 ll --= — 7 1 1 1 6 1 
8 7 10 7 —_ 1 1 —— —- 1 1 
8 3 8 1 — — 13 — 
e 9 3 7 — 1 1 10 2 
e 
10 8 3 1 1 5 —_ 
n ll 3 4 — me 8 
, 12 4 1 5 
13 4 1 8 
r 14 = — 5 
e 15 3 a= 3 
16 2 1 6 
6 17 1 ae 6 
18 — — 2 
19 — 1 1 
20 — — 2 
21 ai es 1 
22 — — _ 
23 2 skid 1 
1 24 — ws ad 
25 — —_— 
4 26 ~ ih 
] 27 ee = 
29 we se 
30 1 is 
f 31 —_ — 
34 i si 
; | 
. N 100 100 100 100 100 100 100 100 | 100 100 
; k, 6-99 5:05 2-31 | 1-32 1-58 0-67 0-68 1-26 | 10-59 1-37 
? ke 27-56 14-53 | 2-68 | 2-00 5-42 3°37 103; 1:12; 22-53 3:71 
- ks 226-81 44-52 2°51 | 2-61 18-85 20-79 2-07 1-21 31-08 14-00 
| 
| 
) Method 1: 4 2-94 1-88 0-16 0-51 2-43 4-04 0-51 0:00 1-13 1-71 
T». +67 — 15-7 —1-00 | —1-27 | —8-22 | —4-37 | +0-08 | —0-05 | —35-5 | —0-39 
S.E. (T'p.) 54 19 1-2 1 7:5 9 0-5 0-35 30 4 
N.A +97* — 68 —0-97 | —1-:09 | —3-56 | +1-:09 | +0-17 — 28-8 +1-61 
S.E. (T'y.4.) | 44 17 1-2 1 5-4 4 0-45 30 3-2 
. 
Method 2: 42 0-80 3-62 4-75 0-66 1-43 
Up, —0-38 | —1-88 | —0-48 | —0-10 + 0-38 
s.E. (Up) 0-29 1-16 0-84 0-13 0-44 
42, 1-82 2-09 0-72| 260! 324| 0-60 1-20 
Twa. + 7-85*| — 1-05 —0-27 | —0-27 | +0-53 | —0-06 +0-70* 
s.E. (Uy) 3-0 2-0 0-25 0:64 0-34 0-1 0-34 
Note. In all the tables an asterisk against the value of a test statistic indicates that it is greater in 
magnitude than twice its estimated standard error: two asterisks denote that it exceeds three times its 
estimated standard error. 













Table 1A (cont.)



























































| 
Number of 
individuals | 116 126 i3¢ l4c 15d | 16e 17f 18f 19f | 20g | 2lg 
0 15 75 1 274 88 —_ 181 237 354 52 18 
1 5 10 | 165 71 101 an 118 110 58 32 3 
2 9 3 | 27 58 | 101 al 97 78 41 13 7 
3 6 2 42 36 84 1 54 30 25 3 9 
4 ater 4 77 20 54 2 32 23 ll oe 12 
5 3 2 77 12 30 3 9 8 3 8 
6 ave 2 89 10 16 2 5 6 4 8 
7 1 1 57 7 16 6 3 3 3 3 
8 1 _ 48 oT. 7 1 dige ee 9 
9 sep 24 | o..4+>4 6 4 mr 5 
10 << i ans 1 8 1 1 1 
11 ee eas oe 10 4 
12 9 1 4 — 
13 3 5 4 
14 i 2 1 
15 7 2 
16 7 2 
17 2 2 
18 5 1 
19 2 1 
20 3 — 
21 2 
22 3 
23 | 2 
24 2 
25 | — 
26 3 
27 1 
29 1 
30 2 
31 1 
34 1 
N 100 |100 500 500 500 100 (500 |500 500 |100 100 
ky 0-71| 0-78 580! 1-31 2-43) 14-21) 1-41) 1-18 0-66! 0-67| 5-71 
ks 2-35; 339 | 643) 423 | 4-01] 4639) 2-25! 2-70 1:75| 0-67| 23-64 
ks 9-67| 19-26 7-63| 1834 | 7-36) 250-47; 3-99, 9-36 6-37| 0:56| 99-81 
Method 1:4} 2-31| 3:34 | O-11| 2-24 | 0-65| 226| 0-59| 1-29 1:66| 0-00) 314 
q —1-64|—2-39 |-0-18|—1-54 | —1-38/+30-41|-—0-69|+0-68 |—0-26|—0-11|—44 
s.E. (Tp) 35 | 7 18 | 27 1:3 | 95 0-6 1-1 0-89} 0-13| 44 
NA. +0-25| +1-96|—0-14/+1-73 |-0-86|+66-8 |-—0-44/+1-66 | +0-64 —16 
s.E. (Ty.,) 2 4 18 18 1-1 83 05 | 0-9 0-6 36 
| | 
Method 2:42 | 2-94' 3-42 2-34 | 0-78| 1-17 1:81 4-66 
Up. |—0-45 | — 0-06 —0-14 —0-26|+0-14 |-0-10 — 8-67 
s.E.(Up) | 0:50! 0-64 0-29 0-14) 0-15 0-13 4-95 
42, 2:19! 2-49 1-82 | 0-71 0-70! 1-00 1-47 3-19 
wa. + 0-08 | + 0-67* +0-55* | -—0-13 —0-15 | +0-34** | +0-13 — 0-30 
s.E.(Uxy,) | 0:26| 0-31 0-19 | 0-23 | 0-12 | 0-11 0-08 3 

























































































Table 1A (cont.)
Number of 
individuals 229 239 249 25h 26h 27h 28h 29h 30h 
21g 
0 12 53 65 101 124 41 150 66 63 
18 1 22 32 18 25 76 14 34 19 19 
3 2 19 10 10 26 33 5 28 9 ll 
7 3 17 2 5 36 17 2 8 2 6 
9 4 15 3 2 18 2 1 4 2 1 
12 5 6 — _ 21 —_— — 2 1 — 
8 6 5 6 _- 2 1 
8 7 2 4 1 —_ —_ 
3 8 Z 6 es 
9 9 -— 5 
5 10 3 
11 1 
1 
4 Pills OSA ter ask, HES eh 
= N 100 100 100 252 252 64 228 100 100 
4 ky 2-61 0-70 0-61 2-22 0-80 0-64 0-67 0-62 0-63 
1 ke 3-69 0-90 0-99 6-55 0-93 1-44 1-33 1-29 0-94 
4 ky 5-06 1-4] 1-64 19-98 0-97 5-31 3-28 3-64 1-33 
2 Bae 
I Method 1: 4 0-42 0-28 0-62 1-95 0-16 1-25 1-00 1-08 0-50 
1 T». — 1-48 + 0-03 — 0-45 — 7-85 — 0-24 +0-77 — 0-38 — 0-07 — 0-47 
S.E. (Tp) 2 0-32 0-52 4-7 0-19 1:25 0-66 1-00 0-43 
Tra. —1:26 |+0-06 | —0-33 | —3-64 | —0-23 |+1-27 | -—0-05 | +0-:29 | —0-39 
s.E. (Ty) 2 0-3 0-48 4 0-19 1-20 0-5 0-7 0-4 
Method 2: 4? 0-21 0-83 2-86 0-88 1-18 0-98 0-73 
P. +006 | —0-13 | —2-03* +0-24 | -0-12 | +006 | —0-15 
s.E. (Up.) 0-08 0-14 0-79 0-18 0-13 0-16 0-13 
42), 0-43 0-20 0-74 2°15 0-24 0-78 1-02 0-86 0-66 
NuA. —0:04 |+0-06 | —0-:07 | -—0-44 | —0-:06 | +0-30* | —0-01 | +0-:13 | —0-10 
s.E.(Uy,) | 0-54 0-07 0-11 0-49 0-06 0-15 0-1 | 0-12 0-1 
Table 1B. List of plant species and localities referred to in Table 1A

Plant species
 1a  Salicornia stricta          16e  Salicornia stricta
 2   Plantago maritima           17f  Carex flacca
 3   Limonium vulgare            18   Briza media
 4   Triglochin maritima         19   Helictotrichon pratense
 5   Armeria maritima            20g  Thymus serpyllum
 6   Festuca sp.                 21   Festuca sp.
 7   Suaeda maritima             22   Bromus erectus
 8b  Glaux maritima              23   Carex sp.
 9   Juncus gerardii             24   Briza media
10   Aster tripolium             25h  Hypnum schreberi
11   Triglochin maritima         26   Spergularia rubra
12   Plantago maritima           27   Juncus effusus seedlings
13c  Glaux maritima              28   Juncus effusus seedlings
14   Plantago coronopus          29   Juncus tussocks
15d  Carex arenaria              30   Juncus tussocks

Localities
a  Limonium Salt Marsh, Blakeney Point      e  Salicornia Salt Marsh, Blakeney Point
b  Juncus Marsh, Havant                     f  Chalk Grassland, Pitstone
c  Glaux Low, Blakeney Point                g  Chalk Grassland, Otford
d  Carex Dune, Blakeney Point               h  The Devonshire Moors
















Table 1C. Classification of the habitats sampled in locality h, Table 1B, together
with the size of quadrat used

25  L, D, H, 39 sq.cm.               28  L, H, 600 sq.cm.
26  I, D, H, dense, 39 sq.cm.        29  L, L, 2500 sq.cm.
27  I, L, 156 sq.cm.                 30  L, H, 2500 sq.cm.

I or L  Initial or late stage of colonization.
W or D  Wet or dry.
H or L  Highland or lowland areas.


Salix atrocinerea, the underlying vegetation was completely obliterated by the mica deposits.
In view of the wind dispersal and the uniform surface one would expect a Poisson
distribution for a quadrat count of willow seedlings, and this was in fact borne out by a sample
of sixty-four quadrats of area 156 sq.cm., for which the mean number of seedlings per
quadrat was 0.734. The water used to transport the mica in suspension was derived from
moorland streams and probably carried in the separate protonemata of the mosses. Each
protonema then gave rise to a number of upright shoots, and counts of the latter gave
evidence of contagion, as shown in Table 1A, species numbered 25h. As soon as the dams
dried out, the herb Spergularia rubra was found. The count given (26h) was taken on ground
bare of parent plants. It seems likely that small aggregates of seeds were carried in by
animals, which frequented the dams as soon as they were firm enough. The rush, Juncus
effusus, was one of the earliest colonizers, and was also one of the most persistent. The
count of Juncus seedlings (28h) was made on an older dam with numerous well-established
but widely spaced clumps of rush, from which seed was possibly distributed by wind and
by grazing animals.
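
To indicate what the Poisson expectation for the willow seedlings amounts to, the Python
sketch below computes the expected numbers of quadrats containing 0, 1, 2, ... seedlings
from the two figures quoted above, namely sixty-four quadrats and a mean of 0.734. The
observed frequencies are not reproduced in the text, so no comparison is attempted here;
the function name is illustrative.

```python
import math

def poisson_expected_frequencies(mean, n_quadrats, max_count):
    """Expected numbers of quadrats holding 0, 1, ..., max_count individuals."""
    return [n_quadrats * math.exp(-mean) * mean ** k / math.factorial(k)
            for k in range(max_count + 1)]

if __name__ == "__main__":
    # Values quoted in the text: 64 quadrats, mean 0.734 seedlings per quadrat
    for k, e in enumerate(poisson_expected_frequencies(0.734, 64, 5)):
        print(k, round(e, 2))
```

A goodness-of-fit comparison would set these expected values against the observed counts,
pooling the sparse upper classes in the usual way.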

(iii) Turning now to the plant quadrat counts made in America, Table 2A gives some
results for two series of counts made by Steiger (1930) in Nebraska. Steiger gave the number
of individuals per quadrat for all species found in two sets of forty quadrats. Each quadrat
was 1 m. square, and those species with less than an average of one individual per quadrat
have been excluded from the present discussion. The two sets of quadrats were placed in
vegetation described as ‘high-prairie’ and ‘low-prairie’ respectively. The quadrats were
selected at regular intervals along four parallel lines, and a detailed study of numerous small
areas showed that the propagation of prairie species was largely vegetative. The results of
calculations on counts for twenty high-prairie and eighteen low-prairie species are given in
Table 2A. For both prairie types, the results are arranged in two groups according to
Steiger’s classification of the species into grasses and non-grasses. The species within each
of the groups are arranged in descending order of density, and numbered accordingly. These
numbers are also given in Table 2B, which gives the letter code, used by Steiger, for the
names of the species.

Clapham (1936) has previously discussed Steiger’s data, and he showed that, for the 
great majority of species, individuals do not follow the Poisson form of distribution. 

(iv) The remaining American plant count is given in Table 2C. This count was made by
Hanson (1934) on the Native Prairie of western North Dakota, and gives the number of
stalks of the grass Agropyron Smithii per quadrat for each of 384 quadrats of area 0.1 sq.m.
Sampling was by restricted randomization, as follows. Twelve large areas, each split into
eight subplots, were laid out systematically over a fairly level plateau top. Each subplot was
5.5 yards square and four samples were taken from it by placing a wooden quadrat, 2.5 by










































































































Table 2A. Statistics calculated from frequency distributions for thirty-eight
American plant counts, each based on 40 quadrats (Steiger’s data)

Grasses and sedges, high-prairie sample
Grasses and sedges, low-prairie sample

[Table printed sideways in the original; the numerical entries are not legible in this copy.]

































































































Table 2A (continued)
[Panels: non-grassy species, low-prairie sample; other herbs, high-prairie sample. The table was printed sideways and its entries are not legible in this copy.]








































































































Table 2B. Key to species referred to in Table 2A
[Two lists, 'Grasses and sedges' and 'Other herbs', giving each species with its rank in the high-prairie and low-prairie types. The table was printed sideways and its entries are not legible in this copy.]
† A dash is used to indicate either complete absence, or a density of less than one individual per quadrat.

Table 2C. Frequency distribution of the number of stalks of Agropyron Smithii (Hanson's data)
[The frequencies and the fitted statistics were printed sideways and are not legible in this copy.]








194 Experimental evidence concerning contagious distributions in ecology 


Table 3A. Frequency distribution for eleven insect population counts
(data given by Beall)

Number of
individuals 1a 2a 3a 4a 5b 6b 7b 8c 9c 10c 11c
0 19 24 43 47 | 190 20 33 117 205 162 227 
1 12 16 35 23 | 264 ll 12 87 84 88 70 
2 18 16 17 27 | 304 6 5 50 30 45 21 
3 18 18 ll 9 260 6 6 38 4 23 6 
4 11 15 5 7 294 6 5 21 2 5 1 
5 12 9 + 3 219 1 —_— es _ 2 oe 
6 7 6 1 1 183 1 2 2 — 
7 8 5 2 1 150 2 2 2 
8 4 3 2 —_ 104 3 2 — 
9 4 4 _ — 90 — ou 1 
10 1 3 1 60 1 1 —_— 
11 — —_ 1 46 2 — 
12 1 1 —_ 29 1 — 
13 1 —_ 36 1 _ 
14 -— 19 pas oe 
15 1 12 1 _ 
16 = ll 1 — 
17 1 6 — 1 
18 _ 10 2 — 
19 1 | 9 = we 
| 
20 — | 4 — — 
21 — 1 1 
22 — 3 — 
23 — 4 — 
24 -- 1 — 
| 25 --- 1 at 
26 1 ae = 
27 a ss 1 
28 1 — 
36 ~~ J 
| 45 1 
| = 60 
N 120 120 120 120 2304 70 70 325 325 325 325

k1 4.03 3.17 1.48 1.51 4.74 6.10 2.14 1.40 0.50 0.85 0.41
k2 16.45 7.77 3.19 3.63 15.00 113.13 14.10 2.33 0.58 1.14 0.5
k3 155.11 19.75 9.67 15.63 83.83 3787.50 162.75 4.72 0.72 1.54 0.72
























































Table 3B. List of insect species and treatments referred to in Table 3A

Insect species: a The European corn-borer, Pyrausta nubilalis Hubn.
                b The Colorado potato beetle, Leptinotarsa decemlineata Say.
                c The beet webworm, Loxostege sticticalis L.

Treatments. Counts 1-4 were made on plots given two applications of fungus spores, at the following levels
in grams per acre.

Count   Application on 8 July   Application on 19 July
1       0                       0
2       0                       40
3       40                      0
4       40                      40

Count 5 was made on a potato field near Chatham, Ontario, on 14 August 1935. No information is given about
the treatments given to the beetles in counts 6 and 7. Counts 8-11 were on plots given various chemical
treatments, as follows:

8 No spray               10 Lead arsenate
9 Contact insecticide    11 Contact spray and lead arsenate










D. A. Evans 195 


4.0 dm., in each quarter. The vegetation is described as ‘mixed prairie’, and A. Smithii is
one of the two dominant tall grasses found on the plateau.

This concludes the description of the plant quadrat counts.

(v) Passing on to the counts of insect and larvae populations, counts numbered 1-4 and
6-11 in Table 3A have been given by Beall (1940). One of the series of counts had to be
excluded because it was not given in full. Table 3B gives the key to the species letter code,
and the key to the number code for the treatments applied to the insect populations.

Counts 1-4 were made in an experiment (Stirrett, Beall & Timonin, 1937) on the control
of the European corn-borer by the fungus Beauveria Bassiana Vuill. The fungus treatment
on 8 July occurred at the beginning of the period of oviposition and the second treatment
at its height. The borer lays masses of about twenty eggs, which hatch in July and reach
full growth in August, so that the population sampled was effectively of one age. Counts


Table 4. Frequency distributions for two counts of the number of moth 
eggs per maize plant (Marshall’s data) 




















Frequencies (n, n, Statistics 
Number (r) i ug 2nd count 
of eggs per r ean — 
plant Ist count | 2nd count tinued Counts Ist 2nd 
0 204 101 19 1 N 780 782 
1 143 72 20 3 k, 2-48 5-75 
2 128 89 21 1 k, 7-40 32-10 
3 107 84 22 2 ks 39-43 383-55 
4 71 66 23 2 
5 36 40 24 rl Method 1 
6 32 56 25 — Gm) 1-98 4-58 
7 17 43 26 1 T + 2-67 + 57-25 
8 14 49 ee 1 s.E. (Tx) 4-4 40-4 
9 q 28 28 | — a +7:55* | +117-62* 
10 7 39 =... s.E. (T'p,) 3:3 28 
ies Se ee Method 2 
13 3 12 32 1 42, 2-08 5:07 
15 1 12 34 3 s.E. (Uys) 0-49 2-6 
16 1 10 ws re un. 11-9 40-3* 
17 2 ll 51 1 D.F. 9 18 
18 l 7 soe ma x3. 95 | 342% 
a@ 1-70 3-62 
Uy. +0-71* +5-51* 
s.E. (Up.) 0:34 | 1-5 





























were made of the number of borers on the unit area occupied by a hill of corn, and there had 
been two possible periods of migration of the larvae prior to the examination on 19 October. 

The Colorado potato beetle lays groups of 20-30 eggs, and the two counts 6b and 7b were
made on the number of larvae per 4 ft. strip of potato row. The population sampled ranged
from larvae newly hatched to completely mature larvae.

The beet webworm lays from one to five eggs, and the larvae usually mature on the plant
on which they have hatched, although they can move about freely. Counts 8c-11c were
made on the number of larvae present on unit areas of 3 ft. of row, and the population
consisted mainly of mature larvae with some half-grown larvae. Beall suggests that the contact




insecticide, acting as an irritant, might possibly increase dispersion, whilst the arsenate 
might be expected to depress it. 

(vi) Beall (1938) has also given an account of some earlier work on the Colorado potato
beetle. A plot of potato plants forty-eight rows wide and 96 ft. long was chosen for
examination. By running strings transversely to the rows of potatoes at intervals of 2 ft., the
area was split up into 2304 sampling units of 2 ft. lengths of row. The frequency distribution
of the number of beetles per unit is given in Table 3A, number 5b.

(vii) Finally, Table 4 gives two frequency distributions obtained in a field sampling 
study, by Marshall (1936), of oviposition by the moth, Heliothis obsoleta Fabr. Two counts 
of the number of eggs per maize plant were made for all the maize plants in a plot 23 by 
24 yards. All the eggs found during the first count were destroyed, so that the second count, 
which took place a week later, was independent of the first count. 


1.2. Analysis of data

The purpose of the present analysis is to see whether one can choose between various 
theoretical contagious distributions and say that, for a particular type of data, one form of 
distribution gives a better fit in general than the others. No claim is made that the typical 
form of distribution will always give a good fit. Individual cases will certainly arise where 
the fit is poor, and without further data it is impossible to say whether poor fits are a 
persistent feature for counts of certain species. 

Because of the poor discrimination between alternative theoretical distributions usually 
shown by plant and insect population counts, it seems necessary to examine as a whole 
a large number of counts of each type, as many as one can get in fact, with no personal bias 
in the selection. The problem is then to detect whether one theoretical distribution always 
tends to err in a particular direction, whilst another gives a satisfactory median fit. 

The Neyman Type A, Pólya-Aeppli and negative binomial distributions have been
fitted according to the methods given below in Part II. These three distributions were
chosen as being typical of the various forms of contagious distribution that have been
proposed by various writers (see Anscombe, 1950). They are in increasing order of skewness
and in decreasing order of proportion of zeros, for given mean and variance.

In order to test the relative adequacy of fit of each type of distribution, a statistic T or 
U has been calculated for each (sometimes both statistics were calculated). The statistic 
T is the difference between the sample estimate of the third cumulant and its expected value 
found by using the sample estimates of the first two moments; U is the difference between 
the sample estimate of the variance and its expected value found by using the sample 
estimates of the mean and proportion of zeros. A large positive value of either statistic 
suggests that a more skew form of distribution would give a better fit, while a large negative 
value suggests a less skew form of distribution. The relative sensitivity of the two statistics 
in testing goodness of fit depends on the values of the parameters of the parent distribution. 
The statistic T is relatively easy to calculate so that its value has been given throughout,
and the value of U has been given whenever it is expected to be the more sensitive statistic. 
In general when the mean is small, for a given sample size, no appreciable discrimination 
between alternative forms of distribution is possible; and therefore counts with small means 
have been omitted, as stated above in the description of the data. 
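
The statistic T described above can be sketched numerically as follows. This is an illustration only, under stated assumptions: the third cumulant of each distribution is taken to be m(1 + 3a + ca²) with c = 1, 1.5 and 2 for the Neyman Type A, Pólya-Aeppli and negative binomial forms respectively (standard results for a distribution with mean m and variance m(1 + a), not quoted from this section), and the names `SKEW_COEFF` and `t_statistic` are hypothetical. The inputs are the k-statistics printed for Marshall's first count in Table 4.

```python
# Coefficient c in kappa_3 = m(1 + 3a + c*a^2); these values are assumed
# standard results, not quoted from the text.
SKEW_COEFF = {"neyman_a": 1.0, "polya_aeppli": 1.5, "neg_binomial": 2.0}


def t_statistic(k1, k2, k3, dist):
    """T = k3 minus the third cumulant implied by k1 and k2 under `dist`."""
    a_hat = k2 / k1 - 1.0  # moment estimate of a, from variance = m(1 + a)
    expected_k3 = k1 * (1.0 + 3.0 * a_hat + SKEW_COEFF[dist] * a_hat ** 2)
    return k3 - expected_k3


# k-statistics of Marshall's first count, as printed in Table 4.
k1, k2, k3 = 2.48, 7.40, 39.43
t_values = {dist: t_statistic(k1, k2, k3, dist) for dist in SKEW_COEFF}
for dist, t in t_values.items():
    print(f"{dist:13s} T = {t:+.2f}")
```

With these inputs the negative binomial and Pólya-Aeppli values agree, to the two decimals printed, with the entries +2.67 and +7.55 given for the first count in Table 4, which lends some support to the assumed coefficients.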

The general conclusion which appears from the application of these tests is that, on the 
whole, plant population counts are fitted by a less skew form of distribution than that which 










fits insect population counts. The explanation of this difference may be that in some 
manner the two types of population are differently affected by overcrowding and by com- 
petition. For the plant and insect populations considered here, the plants are larger in size 
on the average than the insects, and they may therefore be more prone to suffer from over- 
crowding, in the sense that their physical size prevents high densities per unit area from 
occurring. Competition by other species may be relatively slight in the case of insect com- 
munities on plant hosts if, as seems likely from the relevant literature, only a small number 
of insect species are able to tolerate one kind of plant host. 

The population counts discussed here do not show any indication that a more skew form 
of distribution than the negative binomial distribution would give a better fit. However, 
in the course of searching the literature for insect population counts, four series of insect 
parasites on animal hosts were found, which showed a greater degree of skewness than could 
be fitted by this distribution. It may be possible to discuss these counts in a later paper, but 
so far no more excessively skew series of this type have been found. 

Referring back to the thirty plant counts given in Table 1A, it will be seen that only a few
of the values of the statistics T and U (marked with an asterisk) differ from zero by more
than twice their estimated standard error. Taking the group as a whole, the Neyman Type A
distribution (N.A.) gives a satisfactory median fit and the Pólya-Aeppli distribution (P.)
gives a significantly high proportion of negative values. Values of T and U for the negative
binomial distribution (N.B.) are not given, but they also show a significantly high
proportion of negative values.

In order to investigate the general spread of the statistics T and U for the Neyman Type A
and Pólya-Aeppli distributions, the value of the more efficient statistic for each count was
divided by its standard error and converted by a table of probits into a probability value.
This use of a probit conversion is only strictly valid if T and U are normally distributed and
the standard error precisely known. One would expect, however, that the error committed
is not serious. It may be as well to stress here that all the theoretical results used in this
analysis are based on large-sample theory. On the null hypothesis that the distribution
fitted is the correct one, a rectangular distribution in (0,1) would be expected for the
probability values. After dividing the interval into five equal parts, this hypothesis was
tested by the $\chi^2$ goodness of fit test. The values of $\chi^2$ (with 4 degrees of freedom) were 5.7
and 12 for the Neyman Type A and Pólya-Aeppli distributions respectively, so that the
latter distribution does not give a satisfactory fit.
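
The probit conversion and the five-cell test of uniformity might be sketched as below. The ratios used are hypothetical values chosen for illustration, not the paper's data, and the function names are assumptions; the normal integral is computed from `math.erf` rather than read from a table of probits.

```python
import math


def normal_cdf(z):
    """Probability that a standard normal deviate is less than z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def uniformity_chi_square(ratios, cells=5):
    """Bin the probability values into equal cells of (0, 1) and form chi-square."""
    counts = [0] * cells
    for z in ratios:
        p = normal_cdf(z)
        counts[min(int(p * cells), cells - 1)] += 1
    expected = len(ratios) / cells
    chi2 = sum((obs - expected) ** 2 / expected for obs in counts)
    return counts, chi2


# Hypothetical values of (statistic / standard error); not taken from Table 1A.
example_ratios = [-2.1, -1.5, -1.2, -0.9, -0.8, -0.4, -0.3, 0.1, 0.6, 1.3]
counts, chi2 = uniformity_chi_square(example_ratios)
print(counts, round(chi2, 2))
# The result is referred to chi-square with cells - 1 = 4 degrees of freedom,
# whose upper 5% point is about 9.49.
```

On this scale the value 5.7 quoted for the Neyman Type A distribution falls below the 5% point and the value 12 for the Pólya-Aeppli distribution exceeds it.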

The conclusion, that the Neyman Type A distribution gives a satisfactory median fit, 
may be verified by calculating and comparing expected and observed group frequencies, 
and this has been done in those cases where there are a large number of observations. The 
results are given in Table 5, using the same key as in Table 1A. The Neyman Type A
distribution was fitted by using the sample mean and proportion of zeros (Method 2, Part II
below), except for count 13c, where the fitting was by moments (Method 1). Expected
frequencies for the Pólya-Aeppli distribution fitted by Method 2 are also given for count 14c.

The value of $\chi^2$ for the goodness of fit test and its number of degrees of freedom are given
at the foot of each count. The total of 53.8 for $\chi^2$ with 42 degrees of freedom is not significantly
large, nor is there any consistent failure in fitting at certain points, so that the conclusion
given above is not discredited.

It may be noted before passing on that for count 14c, the $\chi^2$ test is sufficiently sensitive
to discriminate between the two forms of distribution, the result agreeing with the test
based on the U statistic. The U test is also able to discriminate between the two forms of
distribution for the count 18f, but here, as is more usual, the $\chi^2$ test fails to do so.


Table 5. Expected frequencies for the Neyman Type A distribution fitted to some 
plant population counts from Table 1 











































































































13c l4c 15d 17f 
Number of 
individuals j 
Obs. Exp. | Obs. | N.A. | P.A. Obs. | Exp. | Obs. | Exp. 
| | 
0 16 12:8 274 274-0 274-0 88 88-0 181 181-0 
1 } 71 58-0 75:9 101 105-0 118 127-1 
2 27 28-4 58 58-9 51:5 101 99-9 97 89-0 
3 42 51-3 36 43-6 34:4 84 78-2 54 52-0 
4 17 70-7 20 27-6 22-7 54 53-9 32 27-3 
5 17 79-5 12 16-4 14-9 30 33-8 9 13-2 
6 89 715°8 10 9-5 6 16 19-7 
7 57 63-1 7 5-4 2 16 10-8 
& 48 46-7 7 5-6 
9 24 31-2 | 9 10-4 
10 14 19-1 12 6-4 10-8 
ll 16 | 108 3 5-2 
12 and over 13 10-7 
N 500 500 500 500 
x? 12-2 12-8 4 5-5 3-8 
D.F ; 9 6 6 7 4 
| | y 
18f 19f 25h 26h 28h 
Number of 
individuals | 
Obs. Exp. Obs. | Exp. | Obs. | Exp. | Obs. | Exp. | Obs. | Exp. 
0 237 237-0 354 354-0 101 101-0 124 124-0 150 150-0 
1 110 102-9 58 53-8 25 26-2 76 717-8 34 36-2 
2 78 73-8 41 43-5 26 31-6 33 33-7 28 22-8 
3 30 42:8 25 25-5 36 27-8 8 11-0 
4 23 22-6 ll 12-6 18 20-9 
5 8 11-3 J) 21 14-8 
6 | 6 10-3 
H } 
10 11-6 > 19 16-4 
8 > 8 8-0 
9 L 4 9-7 > 11 10-6 
10 
ll | 9 78 
12 and over J ] 
N 500 500 252 252 228 
x? | 7-5 0-7 8-6 0-5 2-2 
D.F 4 | 3 6 2 
| 








Note. The theoretical distributions fitted are all Neyman Type A except in the case of count 14c
to which a Pólya-Aeppli distribution was also fitted.
















None of the three contagious distributions considered here gives a satisfactory fit to the
thirty-eight plant species referred to in Table 2A. When the fitting is done by Method 1, the
Neyman Type A distribution gives the best fit as far as third moments are concerned, but 
goes wrong at the lower end of the frequency distribution, predicting too many zeros in 
thirty-four cases out of thirty-eight.* The negative binomial distribution would predict 
about the right proportion of zeros, but would generally be far too skew. 

These conflicting results may be due to a number of causes. The communities sampled by 
Archibald were well-known natural forms of a more or less stable character and were chosen 
for their uniformity of life form. There were therefore no difficulties of sampling due to 
widely differing life forms. The degree of cover is more or less complete in the chalk grass- 
lands, and especially in the Salicornia and Limonium marshes, where the vegetation forms 
a low compact mat. The structure of the prairie vegetation showed great variation, with 
several species at times locally dominant. The character of this community is entirely 
different from that of the communities studied by Archibald. Many of the grasses, the 
Andropogons and Boutelouas in particular, are sod (that is, tussock) forming with foliage 
from 30 to 65 cm. high. The formation of dense sods of Bouteloua is encouraged by the annual 
mowing of the area. Andropogon furcatus has a rank growth with widely spreading tops 
producing almost complete cover, yet it may occupy only 20 % of the surface area. In the 
case of the sod-forming species it seems certain that Steiger has given the number of flowering 
stems per quadrat, but he is not definite about this. 

No experimental field work seems to have been published about the effect on the shape 
of the frequency distribution produced by using samples based on different quadrat sizes. 
Consequently it is not easy to judge the significance of the fact that Steiger’s quadrat size
is 500 times as large as the 20 sq.cm. quadrat used by Archibald. From the mathematical
point of view, the three distributions considered here are only additive with respect to the
parameter m; that is, if two independent random variables have the same type of
distribution, but with parameters $(m_1, a)$ and $(m_2, a)$ respectively, then their sum also has the same
type of distribution, but with parameters $(m_1 + m_2, a)$.
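
The additive property can be checked numerically for one of the three forms. The sketch below takes the negative binomial distribution with mean m and variance m(1 + a), builds its probabilities by the usual recurrence, and compares the convolution of two such distributions with a common a against the distribution with parameters (m1 + m2, a). The parameter values and function names are illustrative assumptions.

```python
def neg_binomial_pmf(m, a, r_max):
    """P_0 .. P_{r_max} for mean m and variance m(1 + a), by recurrence."""
    k = m / a
    probs = [(1.0 + a) ** (-k)]
    for r in range(r_max):
        probs.append(probs[-1] * (k + r) / (r + 1) * a / (1.0 + a))
    return probs


def convolve(p, q):
    """Distribution of the sum of two independent counts, truncated to len(p)."""
    out = [0.0] * len(p)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < len(out):
                out[i + j] += pi * qj
    return out


m1, m2, a = 1.5, 2.5, 0.8  # illustrative values only
summed = convolve(neg_binomial_pmf(m1, a, 40), neg_binomial_pmf(m2, a, 40))
direct = neg_binomial_pmf(m1 + m2, a, 40)
max_gap = max(abs(x - y) for x, y in zip(summed, direct))
print(max_gap)  # differs from zero only by rounding error
```

The same check fails if the two components are given different values of a, which is the sense in which the distributions are additive only with respect to m.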

Although the quadrat size was so much larger, Steiger only found from ten to twenty 
stems per square metre for each of the more abundant species, and about 300 stems for all 
species combined. The corresponding figures obtained by Archibald were one to two and 
fifteen, so that the English communities had about twenty-five times as many individuals 
per unit area, with fewer species having about the same, high, average number of individuals 
per unit area. 

It is possible that the Neyman Type A distribution is the correct form of distribution, and 
that the annual mowing and the higher number of equal competitors have spread out the 
individuals of each species more evenly than usual, thereby reducing the number of zeros 
observed. 

An alternative conjecture is that the negative binomial distribution is the appropriate 
distribution, and that the upper tail has been compressed because any high densities are 
physically unlikely for large, rank-growing, species. 

* I.e. thirty-four out of the thirty-eight values of $E_{\mathrm{N.A.}}(n_0) - n_0$ shown at the bottom of the table are
positive.

If $n_0$ is the observed number of zeros and $E(n_0)$ is its estimated expected value obtained
by fitting the distribution by moments, then $E(n_0) - n_0 = V$ say, and $U = k_2 - k_1(1 + \hat{a}^{(2)})$
have the same sign for any of the three distributions considered here; and further, since
$U_{\mathrm{N.A.}} > U_{\mathrm{P.}} > U_{\mathrm{N.B.}}$ $(a > 0)$, it follows that

\[ E_{\mathrm{N.A.}}(n_0) - n_0 > E_{\mathrm{P.}}(n_0) - n_0 > E_{\mathrm{N.B.}}(n_0) - n_0. \]



This means that the statistics U and V can be regarded as equivalent as far as the study 
of the proportion of positive to negative signs is concerned. Thus if thirty-four positive 
values and four negative values of V are found for the Neyman Type A distribution, thirty- 
four out of the thirty-eight values of U would also be positive. 
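
The ordering of the expected numbers of zeros can be illustrated as follows. The zero probabilities used are assumptions of this sketch (standard forms for distributions with mean m and variance m(1 + a)); only the negative binomial form is implied directly by the fitting equation of Part II. The values of N, m and a are illustrative and `p_zero` is a hypothetical helper.

```python
import math


def p_zero(m, a, dist):
    """Probability of a zero count for mean m and variance m(1 + a)."""
    if dist == "neyman_a":
        return math.exp(-(m / a) * (1.0 - math.exp(-a)))
    if dist == "polya_aeppli":
        return math.exp(-2.0 * m / (2.0 + a))
    if dist == "neg_binomial":
        return (1.0 + a) ** (-m / a)
    raise ValueError(f"unknown distribution: {dist}")


N, m, a = 40, 2.0, 1.5  # illustrative: 40 quadrats, as in Steiger's samples
expected_zeros = {
    dist: N * p_zero(m, a, dist)
    for dist in ("neyman_a", "polya_aeppli", "neg_binomial")
}
for dist, e0 in expected_zeros.items():
    print(f"{dist:13s} E(n0) = {e0:.2f}")
```

For any observed n0, subtracting it from each expectation preserves the ordering, so a count that gives a positive V for the negative binomial form must give a positive V for the other two.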

The statistics Up and Uy, are given for comparison for the nine low-prairie, non-grassy 
species. Their estimated standard errors have also been given, although for N as low as 
40 they will not be very reliable. 


Table 6. The Neyman Type A distribution fitted to five high-prairie, 
non-grassy species from Table 2 



































4P 5z 6H 7Cs 8Am 
Number of 
individuals 
Obs Exp Obs. | Exp Obs Exp Obs Exp. | Obs Exp 
0 6 6-0 15 15-0 20 : 20-0 18 18-0 24 24-0 
1 12 7-7 10 7-4 5 2-9 
2 7 78 } 13 | 106 } =} ss 6 6-0 4 3-7 
3 | 1 3-3 
: } 7 | 12 \ 7 7-9 } 4 | 6-8 } 6 8-6 1 2-3 
5 and over 8 7-3 6 6-6 5 | 5-8 5 3-8 
| 
$\chi^2$ 4.1 0.4 2.9 1.7 -
D.F. 2 1 1 1 -
$U_{\mathrm{N.A.}}$ +2.5 +0.4 +3.9 +2.9 +13


























The Neyman Type A distribution has been fitted to five high-prairie, non-grassy, species
by Method 2. The results are given in Table 6, together with the values of $\chi^2$ and the statistic
$U_{\mathrm{N.A.}}$. There are not enough degrees of freedom to apply a $\chi^2$ test to the fit for count 8Am,
and the total $\chi^2$ of 9.1 on 5 degrees of freedom for the remaining counts is not significantly
large. Inspection of individual frequencies confirms the picture given by the U statistics.

The final plant count given in Table 2C is also for a tall prairie grass. Gray’s Manual of
Botany gives 3-15 dm. as the height of the stems of Agropyron Smithii. The area was
sampled with an 0.1 sq.m. quadrat, and was selected according to the following criteria:
‘typical and homogeneous vegetation, uniform topography and soil, freedom from erosion,
mowing and grazing or other disturbing factors.’ The Pólya-Aeppli distribution gives the
best fit. The T and U statistics both indicate that a better fit would be given by a more skew
form of distribution than the Neyman Type A.

The analysis of the insect counts given in Table 3A will now be considered. The negative
binomial distribution gives a satisfactory fit to these counts.

When Beall tried to fit this distribution and the Pólya-Aeppli distribution (which he calls
Pólya Types 1 and 2 respectively), he got his parameters interchanged and was consequently
unfair to them. Furthermore, his method of estimation is always by the first two moments,
and this method is seen to be particularly inefficient in the case of the Colorado potato beetle
counts (5b, 6b and 7b).

In addition to the T and U statistics, extensive calculations of expected values were
carried out for the distributions in Table 3A. Expected values were calculated for the negative
binomial and Pólya-Aeppli distributions, fitted by both methods, but restrictions of space
prevent the presentation of this information in full. Table 7A gives the values of the fitted
parameters and of $\chi^2$ for both methods of fitting, for both distributions. None of the values
of $\chi^2$ is significant for the negative binomial distribution, both methods of fitting giving
reasonable values of $\chi^2$. The total $\chi^2$ of 55 for the preferred method of fitting has 50 degrees


Table 7A. Statistics for the negative binomial and Pólya-Aeppli distributions fitted
by Methods 1 and 2 to the insect population counts of Table 3











Count 1a 2a 3a 4a 5b 6b 7b 8c 9c 10c 11c
N 120 120 120 120 2304 70 70 325 325 325 325 
tT Tt Tt t t 
Negative binomial, Method 1
a^(1)        3.08    1.45    1.15    1.41    2.17    17.55    5.58    0.66    0.16    0.34    0.26
T(N.B.)     +37.3   -10.6    -0.9    +1.8    +3.8    -296     -8.6   -0.69   -0.05   -0.37   -0.07
s.e.(T)      36      8.8     3.0     4.1     5.5     1200     89      0.89    0.09    0.28    0.11
chi^2        6.8     6.2     1.2     7.0     21.1    3.7      0.12    5.2     1.43    4.2     0.33
D.F.         7       5       3       3       18      4        2       4       1       2       1

Negative binomial, Method 2
a^(2)(N.B.)  3.07    2.42    1.01    1.43    2.22    12.77    5.20    0.82    0.20    0.48    0.31
U(N.B.)     +0.0    -3.1    +0.22   -0.03   -0.25   +29.1    +0.8    -0.22   -0.02   -0.12   -0.02
s.e.(U)      2.7     1.8     0.41    0.56    0.69    25       3.7     0.20    0.03    0.08    0.03
chi^2        6.8     4.9     1.3     7.1     21.6    1.4      0.08    5.3     1.51    3.7     0.11
D.F.         7       5       3       3       18      4        2       4       1       2       1

Pólya-Aeppli, Method 1
T(P.)       +56.5*  -7.3    +0.1    +3.3    +14.9*  +643     +24.7   -0.38   -0.04   -0.32   -0.06
s.e.(T)      27      7.4     2.3     3.1     4.4     1060     49      0.75    0.09    0.26    0.10
chi^2        8.6     3.6     2.8     6.9     41.2*   50.5*    8.1*    3.8     1.30    3.5     0.20
D.F.         7       5       3       3       18      3        2       4               2       1

Pólya-Aeppli, Method 2
a^(2)(P.)    2.38    1.94    0.89    1.22    1.80    7.74     3.70    0.74    0.20    0.45    0.30
U(P.)       +2.8    -1.5    +0.39   +0.28   +1.76*  +59.8*   +4.0*   -0.11   -0.02   -0.10   -0.01
s.e.(U)      1.9     1.2     0.32    0.39    0.48    5.9      1.9     0.16    0.03    0.07    0.03
chi^2        5.0     2.4     2.2     5.7     45.9*   10.3*    2.0     3.7     1.34    2.9     0.04
D.F.         7       5       3       3       18      4        2       4       1       2       1












































Note. For the cases marked †, Method 1 is expected to be the more efficient method of fitting; for the
remaining cases Method 2 is preferable.


of freedom and is not significantly large. Two values of $\chi^2$ are significant for the
Pólya-Aeppli distribution (counts 5b and 6b), and the total $\chi^2$ of 77 on 50 degrees of freedom is
significant at 1%. If, however, the large significant $\chi^2$ for count 5b is removed, the total
$\chi^2$ is found to be insignificant at 36 on 32 degrees of freedom. A significant value of $\chi^2$ produced
by a poor method of fitting only occurs once, in the case of count 7b.

The fits by the preferred method to counts 5b and 6b for both distributions have been
given in Table 7B. The negative binomial distribution gave similar results for the other
insect population counts. This table also illustrates a persistent feature for the counts as
a whole, which is that an observed cell frequency is usually either greater than, or less than,
the expected frequency for both types of distribution. Only very rarely is the observed
frequency greater than one expectation and less than the other.

The negative binomial distribution gives a satisfactory fit to Marshall’s first moth egg
count, given in Table 4. The values of T and U are both greater than twice their estimated
standard errors for the Pólya-Aeppli distribution, although $\chi^2$ for the distribution fitted by
Method 2 is only just greater than expectation.


Table 7B. The negative binomial (N.B.) and Pólya-Aeppli (P.) distributions fitted
to two insect population counts
































Count 5b Count 5b continued Count 6b 
r Obs. N.B. r. r | Obs. | N.B. P. r Obs. | N.B. P. 
0 190 185-5 237-3 12 | 29 37-7 40-0 0 20 20-0 20-0 
1 264 277-3 258-8 13 | 36 28-2 29-5 1 ll 8-9 5-1 
2 304 302-3 275:8 oe e.. 20-9 21-5 2 12 10-7 9-1 
3 260 288-6 268-2 15 |; 12 15-4 15-5 3 
4 294 256-0 245-1 a 11-3 11-1 4 
5 219 216-7 213-9 17 6 8-3 7-9 o| 8 9-5 10-8 
6 183 177-6 180-1 18 10 | 6-1 5-6 6 
7 150 142-1 147-3 19 6 | 1-6 6-6 7-10 6 7-4 10-0 
8 104 111-6 117-6 20 11-18 8 7-4 10-0 
9 90 86°4 92-0 jover20| 11 | 8-2 5-8 over 18 5 6-1 4:9 
10 60 66-2 70-7 
ll 46 | 50-2 53-5 x. | | 21-1 | 412%] x32 1-4 10-3* 
| | 




















The negative binomial distribution is possibly the appropriate distribution for Marshall’s
second count. Both T and U are significantly positive for the Pólya-Aeppli distribution,
and the values of these statistics are insignificant for the former distribution. Expected
frequencies were found for the two distributions fitted by Method 2, and both gave
significantly large values of $\chi^2$. Comparing the observed with the expected series, there is a
suggestion that even values of r have been favoured at the expense of odd values. The
negative binomial distribution was also fitted by maximum likelihood, giving a value of
36.5 for $\chi^2$ with 18 degrees of freedom. The same feature was again suggested; two-thirds of
the value of $\chi^2$ being contributed by large positive differences, observation − expectation,
for cells 0, 8 and 10, and large negative values for cells 1 and 5. Marshall mentions that the
‘zero’ and ‘few’ egg frequencies may be in error because of the minuteness of the eggs.


Part II. MATHEMATICAL FORMULAE 


2.1. Introduction
The application of two methods of fitting and the corresponding tests of departure from the
fitted form of distribution are considered for each of the three contagious distributions
in turn.
The two methods of fitting are: Method 1, the use of the mean and variance, and Method 2,
the use of the mean and proportion of zeros. The usefulness of the latter method as an
alternative to the method of moments for the negative binomial distribution has been
pointed out by Anscombe (1950), who also gave the criteria, T and U, of the tests of departure,
with particular reference to the negative binomial distribution. These criteria have been
described above in the first part of the section on the analysis of the data (p. 196).
Some results for the Poisson limiting form of the three distributions are given after the
following section on notation.
2.2. Notation

$N$ is the total number of observations in the sample,

$n_r$ is the number of observations equal to $r$ $(r \geq 0)$,

$k_1, k_2, k_3$ are the first three k-statistics,

$P_r$ is the probability of observing any non-negative integer $r$,

$\kappa_i, \kappa_{(i)}$ denote the $i$th cumulant and the $i$th factorial cumulant, and are defined by

\[ \ln\Bigl\{\sum_{r=0}^{\infty} P_r e^{rt}\Bigr\} = \sum_{i=1}^{\infty} \kappa_i \frac{t^i}{i!}, \qquad \ln\Bigl\{\sum_{r=0}^{\infty} P_r (1+t)^r\Bigr\} = \sum_{i=1}^{\infty} \kappa_{(i)} \frac{t^i}{i!}. \]

The estimates of a parameter by Methods 1 and 2 are signified by attaching an
appropriate upper suffix, in brackets. For example, $\hat{a}^{(1)}$ and $\hat{a}^{(2)}$ denote the sample estimates
of the parameter $a$ by Methods 1 and 2 respectively. It may be noted here that both $\hat{a}^{(1)}$
and $\hat{a}^{(2)}$ should be non-negative. If either of the estimates is found to be negative, the value
zero is arbitrarily assigned to it, implying that a Poisson distribution is being fitted. To
distinguish the three distributions, negative binomial, Pólya-Aeppli and Neyman Type A,
suffix initials, for example $\hat{a}_{\mathrm{N.B.}}$, $\hat{a}_{\mathrm{P.}}$ and $\hat{a}_{\mathrm{N.A.}}$, are used when this is required.

2-3. Poisson limiting form of distribution 

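All three distributions have mean m and variance m(1+a), and they all tend to the
Poisson form as $a \to 0$. The values of the variances of the various estimates and statistics
have the same limiting values for all three distributions, and these are given below. The
large-sample variances and covariances have been obtained by treating $\hat m - m$, $\hat a^{(1)} - a$, and
$\hat a^{(2)} - a$ as infinitesimals:

\[ N\,\mathrm{var}(\hat m) \sim m, \qquad N\,\mathrm{var}(\hat a^{(1)}) \sim 2, \qquad N\,\mathrm{var}(\hat a^{(2)}) \sim 4(e^m - 1 - m)/m^2, \]
\[ N\,\mathrm{cov}(\hat m, \hat a^{(1)}) \sim 0, \qquad N\,\mathrm{cov}(\hat m, \hat a^{(2)}) \sim 0, \qquad N\,\mathrm{cov}(\hat a^{(1)}, \hat a^{(2)}) \sim 2, \]
\[ N\,\mathrm{var}(T) \sim 6m^3, \qquad N\,\mathrm{var}(U) \sim 4\{e^m - 1 - m - \tfrac{1}{2}m^2\}. \]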


2-4. Negative binomial distribution 


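\[ P_r = (1+a)^{-m/a}\,\frac{\Gamma(m/a + r)}{r!\,\Gamma(m/a)}\Big(\frac{a}{1+a}\Big)^r \quad (r \ge 0), \]

\[ \kappa_2 = m(1+a), \qquad \kappa_{[r]} = (r-1)!\, m a^{r-1}, \]

\[ \kappa_r = m \sum_{s=1}^{r} \frac{\Delta^s 0^r}{s}\, a^{s-1}. \]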

Estimation by Method 1 





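\[ \hat m = k_1, \qquad \hat a^{(1)} = (k_2/k_1) - 1, \]
\[ N\,\mathrm{var}(\hat m) = m(1+a), \quad N\,\mathrm{var}(\hat a^{(1)}) \sim 2(1+a)^2 + a(1+a)(2+3a)/m, \quad N\,\mathrm{cov}(\hat m, \hat a^{(1)}) \sim a(1+a). \]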
Estimation by Method 2 


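$\hat m = k_1$, and $\hat a^{(2)}$ is given implicitly as
\[ \hat a^{(2)}/\ln(1+\hat a^{(2)}) = k_1/\ln(N/n_0), \]

\[ N\,\mathrm{var}(\hat a^{(2)}) \sim \frac{a^2(1+a)}{m} + \frac{(1+a)^2 a^4\{(1+a)^{m/a} - 1 - m/(1+a)\}}{[m\{a - (1+a)\ln(1+a)\}]^2}, \]

\[ N\,\mathrm{cov}(\hat m, \hat a^{(2)}) \sim a(1+a). \]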











204 Experimental evidence concerning contagious distributions in ecology 
Let $k_1/\ln(N/n_0) = c$; then $\hat a^{(2)}$ is the root of the equation
\[ f(a) = a - c\ln(1+a) = 0. \]

A unique positive solution exists if c > 1. An iterative formula based on Newton's
method of tangential iteration is
\[ a_{i+1} = c\,\frac{(1+a_i)\ln(1+a_i) - a_i}{1 + a_i - c}, \]
where $a_i$ is monotonic decreasing towards $\hat a^{(2)}$ if $a_0$ is chosen so that $f(a_0)$ is small and positive.
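The Newton step for f(a) = a - c ln(1+a), namely a_{i+1} = c{(1+a_i) ln(1+a_i) - a_i}/(1 + a_i - c), can be sketched in Python as follows. This is only a minimal illustration, not the computing procedure used in the paper: the function name, the tolerance and the default starting value a_0 = c^2 (chosen here merely because f(c^2) > 0 whenever c > 1) are assumptions.

```python
import math

def solve_a_newton(c, a0=None, tol=1e-12, max_iter=200):
    """Positive root of f(a) = a - c*ln(1+a), assuming c > 1.

    For c <= 1 there is no positive root and 0.0 is returned,
    which corresponds to fitting the Poisson form.
    """
    if c <= 1.0:
        return 0.0
    # Assumed starting value: f(c*c) > 0, so the iterates decrease to the root.
    a = c * c if a0 is None else a0
    for _ in range(max_iter):
        a_next = c * ((1.0 + a) * math.log1p(a) - a) / (1.0 + a - c)
        if abs(a_next - a) <= tol * max(1.0, abs(a_next)):
            return a_next
        a = a_next
    return a

a_hat = solve_a_newton(10 ** 0.20)   # c with log10(c) = 0.20
cube_root = a_hat ** (1.0 / 3.0)     # comparable with the tabulated cube root of a
```

Starting to the right of the root keeps 1 + a_i - c positive, so every step is well defined.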

A one-page table of a/ln(1+a) = c makes the task of finding $\hat a^{(2)}$ much easier. Table 8 gives
values of $\sqrt[3]{a}$ for $\log_{10} c$ = 0.05(0.01)0.65. Linear interpolation may be used for $\log_{10} c$ > 0.20,
giving values of a correct to four significant figures with a possible error of one unit in the
last place.


Table 8. $\sqrt[3]{a}$ in terms of $\log_{10} c$, where a is such that a/ln(1+a) = c

(Δ′ is the first difference to the next entry and Δ″ the second difference, in units of the fourth decimal place.)

log10 c | ∛a     | Δ′  | Δ″  | log10 c | ∛a     | Δ′  | log10 c | ∛a     | Δ′  | log10 c | ∛a     | Δ′
0.05    | 0.6330 | 440 |     | 0.20    | 1.1087 | 257 | 0.35    | 1.4778 | 241 | 0.50    | 1.8445 | 251
0.06    | 0.6770 | 403 | -37 | 0.21    | 1.1344 | 254 | 0.36    | 1.5019 | 241 | 0.51    | 1.8696 | 252
0.07    | 0.7173 | 376 | -27 | 0.22    | 1.1598 | 252 | 0.37    | 1.5260 | 242 | 0.52    | 1.8948 | 254
0.08    | 0.7549 | 353 | -23 | 0.23    | 1.1850 | 250 | 0.38    | 1.5502 | 242 | 0.53    | 1.9202 | 254
0.09    | 0.7902 | 336 | -17 | 0.24    | 1.2100 | 248 | 0.39    | 1.5744 | 242 | 0.54    | 1.9456 | 256
0.10    | 0.8238 | 322 | -14 | 0.25    | 1.2348 | 247 | 0.40    | 1.5986 | 243 | 0.55    | 1.9712 | 257
0.11    | 0.8560 | 310 | -12 | 0.26    | 1.2595 | 245 | 0.41    | 1.6229 | 243 | 0.56    | 1.9969 | 258
0.12    | 0.8870 | 299 | -11 | 0.27    | 1.2840 | 244 | 0.42    | 1.6472 | 244 | 0.57    | 2.0227 | 260
0.13    | 0.9169 | 292 | -7  | 0.28    | 1.3084 | 244 | 0.43    | 1.6716 | 244 | 0.58    | 2.0487 | 261
0.14    | 0.9461 | 284 | -8  | 0.29    | 1.3328 | 242 | 0.44    | 1.6960 | 246 | 0.59    | 2.0748 | 262
0.15    | 0.9745 | 278 | -6  | 0.30    | 1.3570 | 242 | 0.45    | 1.7206 | 246 | 0.60    | 2.1010 | 264
0.16    | 1.0023 | 272 | -6  | 0.31    | 1.3812 | 242 | 0.46    | 1.7452 | 247 | 0.61    | 2.1274 | 266
0.17    | 1.0295 | 268 | -4  | 0.32    | 1.4054 | 242 | 0.47    | 1.7699 | 248 | 0.62    | 2.1540 | 267
0.18    | 1.0563 | 264 | -4  | 0.33    | 1.4296 | 241 | 0.48    | 1.7947 | 248 | 0.63    | 2.1807 | 269
0.19    | 1.0827 | 260 | -4  | 0.34    | 1.4537 | 241 | 0.49    | 1.8195 | 250 | 0.64    | 2.2076 | 270
0.20    | 1.1087 |     |     | 0.35    | 1.4778 |     | 0.50    | 1.8445 |     | 0.65    | 2.2346 |
In order to interpolate for subtabular values, use Bessel's formula:

\[ f_\theta = f_0 + \theta\,\Delta'_{1/2} - \frac{\theta(1-\theta)}{4}\,(\Delta''_0 + \Delta''_1). \]

Except where given, second-order differences may be neglected.
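Bessel's formula to second differences can be illustrated with a short Python sketch. The function name is hypothetical, and the numerical example assumes that the table entries are cube roots of a and that the second differences near the head of the table are negative (as they must be for a column whose first differences decrease).

```python
def bessel_interpolate(f0, f1, d2_0, d2_1, theta):
    """Bessel's formula truncated at second differences.

    f0, f1     : tabulated values bracketing the argument
    d2_0, d2_1 : signed second differences at those two entries
    theta      : fraction of the tabular interval, 0 <= theta <= 1
    """
    first_difference = f1 - f0
    correction = theta * (1.0 - theta) / 4.0 * (d2_0 + d2_1)
    return f0 + theta * first_difference - correction

# Entries for log10(c) = 0.06 and 0.07, second differences -37 and -27
# (units of the fourth decimal), interpolated at log10(c) = 0.065.
value = bessel_interpolate(0.6770, 0.7173, -0.0037, -0.0027, 0.5)
a_interp = value ** 3   # back from the cube root to a
```

Where the second differences are negligible the correction term vanishes and the formula reduces to linear interpolation.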


The exponent of the negative binomial distribution, defined by
\[ k = m/a, \]
could have been used instead of the parameter a. The equation for the estimate $\hat a^{(2)}$ is simpler
in practice than the equivalent equation for $\hat k^{(2)} = \hat m/\hat a^{(2)}$. It may be noted that for the
negative binomial distribution both $\hat k^{(1)} = \hat m/\hat a^{(1)}$ and $\hat k^{(2)}$ are uncorrelated in large samples
with the fully efficient estimator $\hat m$.
Use of a combined estimate 


It may happen that neither Method 1 nor Method 2 is particularly efficient, and in this
case the possibility may be considered of combining the two estimates $\hat a^{(1)}$ and $\hat a^{(2)}$ in order
to get a more efficient estimate, $\hat a_w$, say, defined by

\[ \hat a_w = \hat a^{(1)} + w(\hat a^{(2)} - \hat a^{(1)}). \]







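The variance of $\hat a_w$ is a minimum for
\[ w = \frac{\mathrm{var}(\hat a^{(1)}) - \mathrm{cov}(\hat a^{(1)}, \hat a^{(2)})}{\mathrm{var}(\hat a^{(1)}) - 2\,\mathrm{cov}(\hat a^{(1)}, \hat a^{(2)}) + \mathrm{var}(\hat a^{(2)})}. \]
The value of the minimum variance is
\[ \mathrm{var}(\hat a_w) = \{w^2\,\mathrm{var}(\hat a^{(2)}) - (1-w)^2\,\mathrm{var}(\hat a^{(1)})\}/(2w - 1), \]
and for the negative binomial distribution,
\[ N\,\mathrm{cov}(\hat a^{(1)}, \hat a^{(2)}) \sim \frac{a^2(1+a)}{m}\Big\{1 - \frac{m+a}{a - (1+a)\ln(1+a)}\Big\}. \]
Contours of this weighting function w are given in Fig. 1. The bounding contours $\mathcal{E}_1 = 90\%$
and $\mathcal{E}_2 = 90\%$ given in this figure refer to the efficiencies of $\hat k^{(1)}$ and $\hat k^{(2)}$ as estimators of k.
They are redrawn from Fig. 1 of Anscombe (1950). The contour $\mathcal{E}_w = 90\%$ refers to the
efficiency of the estimator $\hat k_w = \hat m/\hat a_w$, and indicates the expected increase in efficiency.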














Fig. 1. Contours of the weighting function w for the negative binomial distribution. 
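The weighting of the two estimates can be illustrated with a small Python sketch. The function names are hypothetical; the weight w = (V1 - C)/(V1 - 2C + V2) is the one that minimises the variance of a^(1) + w(a^(2) - a^(1)), where V1, V2 and C stand for the two variances and the covariance, and the numerical inputs below are arbitrary illustrative values, not estimates from the data.

```python
def combined_estimate_weight(var1, var2, cov12):
    """Weight w minimising var{a1 + w*(a2 - a1)} and the resulting variance."""
    denom = var1 - 2.0 * cov12 + var2          # this is var(a2 - a1)
    if denom <= 0.0:
        raise ValueError("var(a2 - a1) must be positive")
    w = (var1 - cov12) / denom
    min_var = (1.0 - w) ** 2 * var1 + 2.0 * w * (1.0 - w) * cov12 + w ** 2 * var2
    return w, min_var

def combine(a1, a2, w):
    """Combined estimate a_w = a1 + w*(a2 - a1)."""
    return a1 + w * (a2 - a1)

# Arbitrary illustrative inputs: var1 = 2, var2 = 3, cov = 1.
w, v_min = combined_estimate_weight(2.0, 3.0, 1.0)
```

With these inputs w = 1/3 and the minimum variance is 5/3, smaller than either of the separate variances.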
If estimation of parameters is by Method 1, we may compute
\[ T = k_3 + k_2 - 2k_2^2/k_1, \]
and test the significance of its deviation from zero by
\[ N\,\mathrm{var}(T) \sim 2m(m+a)(1+a)^2\{2a(5a+3) + 3m(1+a)\}. \]
If estimation is by Method 2, we may compute
\[ U = k_2 - k_1(1 + \hat a^{(2)}), \]
and test the significance of its deviation from zero by
\[ N\,\mathrm{var}(U) \sim 2m(m+a)(1+a)\,\frac{(1+a)^2\ln(1+a) - a(1+2a)}{(1+a)\ln(1+a) - a} + \frac{(1+a)a^4}{[(1+a)\ln(1+a) - a]^2}\,\{(1+a)^{1+m/a} - (m+1+a)\}. \]













Contours of the standard errors of T and U for N = 100 are given in Fig. 2. The line EH
in Figs. 2, 3 and 4 represents the contour along which the efficiencies of Methods 1 and 2 for
estimating the parameters of the distribution concerned are equal. Method 1 is to be
preferred in the area to the right of this line, and Method 2 to the left. Logarithmic
interpolation is satisfactory. For sample sizes other than N = 100 the standard errors stated
must be multiplied by $10/\sqrt{N}$.







Fig. 2. Negative binomial distribution. Standard errors of T and U for N = 100.
For other values of N, the standard error is to be multiplied by $10/\sqrt{N}$.


Illustration of use of Fig. 2 


Let us take the case of fitting a negative binomial to the second count of moth eggs
given in Table 4, using Method 2. It is seen that

\[ N = 782, \qquad \hat m = k_1 = 5.75, \qquad \hat a^{(2)} = 5.07, \qquad 10/\sqrt{N} = 0.358. \]

The chart shows that the point (m, a) falls in the region where Method 2 is preferable and
that for N = 100, s.e. (U) is approximately equal to 7. Consequently the estimated standard
error of U for N = 782 is 7 × 0.358 = 2.5. The value calculated from the formula given
above is 2.6.
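The arithmetic of this illustration can be checked with a short Python sketch. It assumes that the large-sample variance of U for the negative binomial has the form N var(U) ~ 2m(m+a)(1+a){(1+a)^2 ln(1+a) - a(1+2a)}/{(1+a) ln(1+a) - a} + (1+a)a^4 {(1+a)^(1+m/a) - (m+1+a)}/{(1+a) ln(1+a) - a}^2, a form that tends to the Poisson limit 4{e^m - 1 - m - m^2/2} as a tends to 0; the function names are hypothetical.

```python
import math

def n_var_u_negbin(m, a):
    """Assumed large-sample N*var(U) for the negative binomial (a > 0)."""
    log_term = math.log1p(a)
    d = (1.0 + a) * log_term - a
    first = (2.0 * m * (m + a) * (1.0 + a)
             * ((1.0 + a) ** 2 * log_term - a * (1.0 + 2.0 * a)) / d)
    second = ((1.0 + a) * a ** 4 / d ** 2
              * ((1.0 + a) ** (1.0 + m / a) - (m + 1.0 + a)))
    return first + second

def se_u_negbin(m, a, n_obs):
    """Standard error of U for a sample of n_obs observations."""
    return math.sqrt(n_var_u_negbin(m, a) / n_obs)

se_100 = se_u_negbin(5.75, 5.07, 100)   # value read from the chart as about 7
se_782 = se_u_negbin(5.75, 5.07, 782)   # value quoted in the text as 2.6
```

Scaling se_100 by 10/sqrt(782) gives the same figure as se_782, which is the rule stated for sample sizes other than 100.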


2-5. The Pélya-Aeppli distribution 
r—1 
| xs 
a 2m a 4m | mF ete 
r-ee|-sra)) ®-PLeren| (Zeca | ©» 


where X = 4a(2+a)/m. 







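\[ \kappa_r = m\sum_{s=1}^{r} a_s(r)\,a^{s-1}, \]

where
\[ a_s(r) = \sum_{u=1}^{s} (-)^{u+s}\binom{s}{u}\,u^r/2^{s-1}, \]

\[ \kappa_{[r]} = r!\,m\,(a/2)^{r-1}, \qquad \kappa_{r+1} = \Big(1 + a + \frac{a(2+a)}{2}\,\frac{\partial}{\partial a}\Big)\kappa_r. \]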


The probabilities $P_r$ may conveniently be calculated from the recurrence relation

\[ t_r = \Big(2 + \frac{Y}{r-1}\Big)\alpha\,t_{r-1} - \alpha^2 t_{r-2} \quad (r \ge 2), \]

where $t_r = rP_r$, $\alpha = a/(2+a)$ and $Y = 4m/\{a(2+a)\}$.
Alternatively,
\[ P_r = P_0\,\alpha^r\,Y e^{-Y} M(r+1, 2, Y) \quad (r \ge 1), \]

where $M(\alpha, \gamma, x)$ is the confluent hypergeometric function defined by

\[ M(\alpha, \gamma, x) = 1 + \frac{\alpha x}{\gamma\,1!} + \frac{\alpha(\alpha+1)x^2}{\gamma(\gamma+1)\,2!} + \dots \]
A table of $M(\alpha, 2, x)$ for $\alpha$ = 1(1)4 and for x = 0.00 by various increments up to x = 8.0 was
given in the British Association Report for 1927, p. 233; and more recently Rushton (1951)
has given a table for $\alpha$ = 2(1)40.
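The recurrence t_r = (2 + Y/(r-1)) α t_{r-1} - α^2 t_{r-2}, with t_r = rP_r, α = a/(2+a) and Y = 4m/{a(2+a)}, can be sketched in Python as follows. The starting values P_0 = exp{-2m/(2+a)} and P_1 = αYP_0 are assumptions here; they are what a Poisson sum of geometric variables with mean m and variance m(1+a) implies. The function name is hypothetical.

```python
import math

def polya_aeppli_probs(m, a, r_max):
    """P_0 .. P_{r_max} for mean m and variance m(1+a); needs a > 0, r_max >= 1."""
    alpha = a / (2.0 + a)
    y = 4.0 * m / (a * (2.0 + a))
    p0 = math.exp(-2.0 * m / (2.0 + a))     # assumed zero-class probability
    t = [0.0, alpha * y * p0]               # t_r = r * P_r, so t_0 = 0, t_1 = P_1
    for r in range(2, r_max + 1):
        t.append((2.0 + y / (r - 1)) * alpha * t[r - 1] - alpha ** 2 * t[r - 2])
    return [p0] + [t[r] / r for r in range(1, r_max + 1)]

probs = polya_aeppli_probs(2.0, 1.0, 200)
total = sum(probs)
mean = sum(r * p for r, p in enumerate(probs))
variance = sum(r * r * p for r, p in enumerate(probs)) - mean ** 2
```

Accumulating the sums of P_r and rP_r in this way gives a running check that they approach 1 and m.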


Values of the cumulative probability function can be found from the relation


—~ 





eee oe 2 fee R 
were het 0D 24a) eral R-r+1) (R>1), 
where $I_x(p, q)$ is the incomplete β-function.
The following approximate relations for the cumulative sums of $P_r$ and of $rP_r$ were used
for checking purposes:








Tiel lite, Pri {(2+(¥ — 2)/R)a—1} Pp 
Goa <)> BS" ata ¥—2420)/R 
and also
\[ \sum_{r=0}^{R-1} rP_r = m\sum_{r=0}^{R-1} P_r + (R-1)P_{R-1}(\tfrac{1}{2}a)^2 - RP_R(1 + \tfrac{1}{2}a)^2. \]


By writing down the log-likelihood function L and equating the quantity

\[ 2m(1+a)\frac{\partial L}{\partial m} + a(2+a)\frac{\partial L}{\partial a} \]

to zero, one obtains the solution $\hat m = k_1$. Because the probabilities $P_r$ are in the form of
a series, it does not seem possible to obtain convenient expressions for $\hat a$ and its large-sample
variance.


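Estimation by Method 1
We have $\hat m = k_1$, $\hat a^{(1)} = (k_2/k_1) - 1$,
\[ N\,\mathrm{var}(\hat m) = m(1+a), \qquad N\,\mathrm{var}(\hat a^{(1)}) \sim 2(1+a)^2 + a(1+a)(2+a)/m, \]

\[ N\,\mathrm{cov}(\hat m, \hat a^{(1)}) \sim \tfrac{1}{2}a(2+a). \]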
















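Estimation by Method 2
We have $\hat m = k_1$, $\hat a^{(2)} = \{2k_1/\ln(N/n_0)\} - 2$,
\[ N\,\mathrm{var}(\hat a^{(2)}) \sim \Big(\frac{2+a}{2m}\Big)^2\{(2+a)^2(e^{2m/(2+a)} - 1) - 4m\}, \]
\[ N\,\mathrm{cov}(\hat m, \hat a^{(2)}) \sim \tfrac{1}{2}a(2+a), \qquad N\,\mathrm{cov}(\hat a^{(1)}, \hat a^{(2)}) \sim (2+a)^2(m+a)/(2m). \]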


Fig. 3. Pólya-Aeppli distribution. Standard errors of T and U for N = 100.
For other values of N, the standard error is to be multiplied by $10/\sqrt{N}$.
If estimation of parameters is by Method 1, we may compute
\[ T = k_3 - (3k_2^2 - k_1^2)/(2k_1), \]
and test the significance of its deviation from zero by
\[ N\,\mathrm{var}(T) \sim 6(1+a)^3 m^3 + 9a(2+a)\,m\{(5a^2 + 10a + 4)m + a(1+a)(2+a)\}/4. \]
If estimation is by Method 2, we may compute

\[ U = k_2 - k_1(1 + \hat a^{(2)}), \]
and test this by
\[ N\,\mathrm{var}(U) \sim \frac{(2+a)^4}{4}\{e^{2m/(2+a)} - 1\} - 2(1+a)(2+a)m + (a^2 - 2)m^2. \]
Contours of the standard errors of T and U for N = 100 are given in Fig. 3. This chart can
be used to give rough values of the standard errors, as illustrated above in the case of Fig. 2.


2-6. The Neyman Type A distribution 







Beall (1940) suggests that for computation this expression is more conveniently put into
a recurrent form. It may easily be verified that

\[ P_{r+1} = \frac{m e^{-a}}{r+1}\sum_{s=0}^{r}\frac{a^s}{s!}\,P_{r-s} \quad (r \ge 0). \]
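Beall's recurrent form, P_{r+1} = {m e^{-a}/(r+1)} Σ_{s=0}^{r} (a^s/s!) P_{r-s}, lends itself to a direct computation. The Python sketch below is an illustration only; the function name is hypothetical, and the starting value P_0 = exp{-(m/a)(1 - e^{-a})} is an assumption consistent with the Method 2 equation for this distribution.

```python
import math

def neyman_type_a_probs(m, a, r_max):
    """P_0 .. P_{r_max} for mean m and variance m(1+a), with a > 0."""
    p0 = math.exp(-(m / a) * (1.0 - math.exp(-a)))   # assumed zero-class probability
    # weights[s] = a**s / s!, built iteratively to avoid large factorials
    weights = [1.0]
    for s in range(1, r_max + 1):
        weights.append(weights[-1] * a / s)
    probs = [p0]
    scale = m * math.exp(-a)
    for r in range(r_max):
        inner = sum(weights[s] * probs[r - s] for s in range(r + 1))
        probs.append(scale * inner / (r + 1))
    return probs

probs = neyman_type_a_probs(2.0, 1.0, 80)
total = sum(probs)
mean = sum(r * p for r, p in enumerate(probs))
variance = sum(r * r * p for r, p in enumerate(probs)) - mean ** 2
```

The accumulated sums of P_r and rP_r should rise towards 1 and m, which gives a simple check on the calculation.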


The following formula gives a means of checking the calculations:

\[ P_r = e^{-m/a}\,\frac{a^r}{r!}\sum_{j=1}^{\infty}\frac{A^j j^r}{j!} \quad (r \ge 1), \quad \text{where } A = (m/a)\,e^{-a}. \]





It was also found useful to accumulate the sums of $P_r$ and of $rP_r$, which are monotonically
increasing towards 1 and m respectively. Approximate relations such as

\[ \sum_{r=1}^{R} rP_r + (R+1)\Big(1 - \sum_{r=0}^{R} P_r\Big) < m \]

are easy to derive.


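Alternatively, by using a table of the cumulative Poisson distribution, such as that given
by Molina (1947), we have that

\[ \sum_{r=0}^{c-1} P_r = 1 - e^{-m/a}\sum_{j=0}^{\infty} P(c, aj)\,\frac{(m/a)^j}{j!}, \]

where
\[ P(c, aj) = \sum_{i=c}^{\infty} e^{-aj}(aj)^i/i!. \]

\[ \kappa_r = m\sum_{s=1}^{r}\frac{\Delta^s 0^r}{s!}\,a^{s-1}, \qquad \kappa_{r+1} = \Big(1 + a + a\frac{\partial}{\partial a}\Big)\kappa_r, \qquad \kappa_{[r]} = m a^{r-1}. \]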
By writing down the log-likelihood function L and equating the quantity
\[ m(1+a)\frac{\partial L}{\partial m} + a\frac{\partial L}{\partial a} \]
to zero, we get the solution $\hat m = k_1$. It is not possible to obtain convenient expressions for
$\hat a$ and its large-sample variance because of the series form of the probabilities $P_r$. Shenton
(1949) has given an upper bound to the efficiency of the method of moments.
Estimation by Method 1
We have $\hat m = k_1$, $\hat a^{(1)} = (k_2/k_1) - 1$,
\[ N\,\mathrm{var}(\hat m) = m(1+a), \qquad N\,\mathrm{var}(\hat a^{(1)}) \sim 2(1+a)^2 + a(2+a)/m, \]
\[ N\,\mathrm{cov}(\hat m, \hat a^{(1)}) \sim a. \]


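Estimation by Method 2
We have $\hat a^{(2)}$ as the solution of the following equation in a:

\[ \frac{1 - e^{-a}}{a} = \frac{\ln(N/n_0)}{k_1}. \]

The iterative formula
\[ a_{i+1} = c - c\,e^{-a_i}, \]
where $c = k_1/\ln(N/n_0)$, may be used. It is easily seen that $a_i$ is monotonic increasing or
decreasing towards the solution $\hat a^{(2)}$ if $a_0$ is respectively less than or greater than $\hat a^{(2)}$:
\[ N\,\mathrm{var}(\hat a^{(2)}) \sim \frac{a^2}{m(1+a)} + \frac{a^4}{m^2\{(1+a)e^{-a} - 1\}^2}\Big\{e^{m(1-e^{-a})/a} - 1 - \frac{m}{1+a}\Big\}, \]
\[ N\,\mathrm{cov}(\hat m, \hat a^{(2)}) \sim a, \qquad N\,\mathrm{cov}(\hat a^{(1)}, \hat a^{(2)}) \sim a^2\{e^{-a} - 1 - m\}/[m\{(1+a)e^{-a} - 1\}]. \]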
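The iteration a_{i+1} = c - c e^{-a_i} can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, tolerance and iteration cap are hypothetical, and the value of c is constructed so that the root is known to be a = 2.

```python
import math

def solve_a_fixed_point(c, a0, tol=1e-12, max_iter=10_000):
    """Positive root of a = c*(1 - exp(-a)), assuming c > 1 and a0 > 0."""
    if c <= 1.0:
        return 0.0   # no positive root: the Poisson form is fitted
    a = a0
    for _ in range(max_iter):
        a_next = c - c * math.exp(-a)
        if abs(a_next - a) <= tol:
            return a_next
        a = a_next
    return a

c = 2.0 / (1.0 - math.exp(-2.0))          # constructed so that the root is a = 2
from_below = solve_a_fixed_point(c, 0.5)  # iterates increase towards the root
from_above = solve_a_fixed_point(c, 5.0)  # iterates decrease towards the root
```

Convergence is linear rather than quadratic, at a rate of about c e^{-a} per step near the root, so more iterations are needed than with a Newton step.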
If estimation of parameters is by Method 1, we may compute
\[ T = k_3 - k_2 + k_1 - k_2^2/k_1, \]
and test the significance of its deviation from zero by
\[ N\,\mathrm{var}(T) \sim 6(1+a)^3 m^3 + a(18 + 50a + 25a^2 + 2a^3)m^2 + 2a^2(3+a)m. \]
If estimation is by Method 2, we may compute

\[ U = k_2 - k_1(1 + \hat a^{(2)}), \]

and
\[ N\,\mathrm{var}(U) \sim 2m^2\{(1+a)^2 - (a^2/\beta)\} + ma\Big\{2 + a - \frac{a^2\alpha}{\beta}\Big(1 + \frac{a}{\beta}\Big)\Big\} + \frac{a^4}{\beta^2}\{e^{m\alpha} - 1\}, \]
where $\alpha = (1 - e^{-a})/a$ and $\beta = 1 - (1+a)e^{-a}$. Contours of the standard errors of T and U for
N = 100 are given in Fig. 4, which may be used as illustrated in the case of Fig. 2.






















Fig. 4. Neyman Type A distribution. Standard errors of T and U for N = 100.
For other values of N, the standard error is to be multiplied by $10/\sqrt{N}$.


This subject of research was suggested by Mr F. J. Anscombe, whom I have much
pleasure in thanking for his helpful suggestions and criticisms. I would like to thank
Mr D. A. East for preparing the figures for publication, and for the computing involved in
tabulating the raw data given by Beall (1938) and Marshall (1936). I also wish to thank
Mrs E. H. Laurie for help in computing the contours of s.e. ($U_P$). I am grateful for the
generous way in which Dr E. E. A. Archibald and Dr H. Barnes have made available, in 
private correspondence, fuller details of their published data, and for their permission to 
publish the material. 





























The first draft of this work was completed during the tenure of a post-graduate research 
grant made by the Department of Scientific and Industrial Research. The investigations 
leading to the results given in Tables 7a and 7B were made possible by a grant from King’s 
College Research Fund. 


REFERENCES 


ANSCOMBE, F. J. (1950). Biometrika, 37, 358.

ARCHIBALD, E. E. A. (1948). Ann. Bot., Lond., 12, 221.

BARNES, H. & STANBURY, F. A. (1951). J. Ecol. 39, 171.

BEALL, G. (1938). Biometrika, 30, 422.

BEALL, G. (1940). Ecology, 21, 460.

CLAPHAM, A. R. (1936). J. Ecol. 24, 232.

HANSON, H. C. (1934). J. Agric. Res. 49, 815.

MARSHALL, J. (1936). Ann. Appl. Biol. 23, 150.

MOLINA, E. C. (1947). Poisson's Exponential Binomial Limit.
N.Y.: Van Nostrand.

RUSHTON, S. (1951). Certain sequential tests of composite hypotheses.
London University, Ph.D. Thesis.

SHENTON, L. R. (1949). Biometrika, 36, 450.

STEIGER, T. L. (1930). Ecology, 11, 170.

STIRRETT, G. M., BEALL, G. & TIMONIN, M. (1937). Sci. Agric. 17, 587.











[ 212 ] 


MISCELLANEA 


Time intervals between accidents—a note on Maguire, Pearson and Wynn’s paper 
By G. A. BARNARD, Imperial College, London 


Our object in this note is to point out the possibility of using Birnbaum’s recent (1952) tabulation of 
the distribution of Kolmogoroff’s D-statistic in connexion with some of the problems discussed by 
Maguire, Pearson & Wynn (1952) in their paper on industrial accidents. In Table 1 of the latter paper 
we are given the time intervals between accidents involving more than ten men killed for the period 
6 December 1875 to 29 May 1951. The question arises whether the data are consistent with the assump- 
tion that the accident rate over the period has been uniform. 

Let us begin by labelling i = 1, 2, ..., n the n (= 109) accidents considered, in a random manner
(i.e. not, for example, according to their order of occurrence). Since the accident labelled i is equally
likely to be any one of the accidents, it follows that if the accident rate is uniform, this accident is
equally likely to occur anywhere in the time interval considered. In other words, if $x_i$ denotes the time,
measured from the start of the interval, at which the accident i occurred, $x_i$ is an observation from
a rectangular distribution with range from 0 to T, the total length of the interval. And if the accidents
are independent of one another, the $x_i$ are n independent observations from this rectangular distribution.
If we form from these n observations the sample cumulative distribution $F_n(t)$ in the manner indicated
by Birnbaum (1952), we have

\[ nF_n(t) = \text{no. of accidents occurring at or before time } t, \]

while for the theoretical cumulative we have

\[ nF(t) = nt/T, \]
and so if $D_n$ denotes Kolmogoroff's statistic,
\[ nD_n = n\max_{0<t<T}|F_n(t) - F(t)| = \max_{0<t<T}|(\text{no. of accidents up to time } t) - nt/T|. \]


To estimate $nD_n$ from Maguire et al. Table 1, we notice that the time intervals are, with very few
exceptions, below expectation until the 38th accident, which occurred 4191 days from the beginning. At this
point we have
\[ 109\,|F_{109}(4191) - F(4191)| = 38 - (109 \times 4191/26263) = 20.6, \text{ approximately.} \]

Hence $nD_n$ is at least 20.6 for our data. Now according to Birnbaum the approximation $1.6276\sqrt{n}$ is
good for the 1% point of $nD_n$, provided n is larger than 80. We find $1.63\sqrt{109} = 17.0$, approximately,
so that the 1% point is exceeded, and the data show a significant departure from the hypothesis tested.
A lengthier calculation shows that $nD_n$ is about 25, in fact. Either the accidents are not independent
of each other, or the accident rate is not uniform.
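The calculation can be sketched in Python. The function below is a hypothetical helper that evaluates max |(no. of accidents up to time t) - nt/T| from a list of event times; since the individual accident times are not reproduced here, only the lower bound at the 38th accident and the approximate 1% point quoted in the text are evaluated numerically.

```python
import math

def n_times_d(event_times, total_time):
    """n*D_n for events on (0, total_time) under a uniform rate."""
    times = sorted(event_times)
    n = len(times)
    worst = 0.0
    for i, t in enumerate(times, start=1):
        expected = n * t / total_time
        # The count jumps from i-1 to i at time t, so check both sides of the step.
        worst = max(worst, abs(i - expected), abs((i - 1) - expected))
    return worst

n, total_days = 109, 26263
lower_bound = 38 - n * 4191 / total_days      # deviation at the 38th accident
one_percent_point = 1.6276 * math.sqrt(n)     # large-n approximation quoted from Birnbaum
```

Here lower_bound exceeds one_percent_point, which is the comparison made in the text.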

We may now wish to examine the hypothesis that the expected number of accidents in the interval
(t, t+dt) instead of being $\lambda\,dt$, where $\lambda$ is a constant, is $\lambda(t)\,dt$, $\lambda(t)$ being a function of t. We retain the
assumption that the accidents are independent of each other. If we introduce a new 'time' variable u
by the condition $du = \lambda(t)\,dt$, it is evident that the accident rate will again be uniform referred to u-time.
Thus the argument given above applies, with the result that we now have

\[ nD_n = \max_{0<t<T}\Big|(\text{no. of accidents up to time } t) - n\int_0^t \lambda(x)\,dx\Big/\int_0^T \lambda(x)\,dx\Big|. \]
If, for example, we try $\lambda(t)$ proportional to $e^{-kt}$, the second term within the modulus sign becomes
\[ 109\,(1 - e^{-kt})/(1 - e^{-26263k}), \]

and we find that $nD_n$ is well within the 95% limit (given by Birnbaum as $1.3581\sqrt{n}$) if we take
k = 1/20000. This corresponds to a logarithmic decrement of about 1.8% per annum in the accident










rate. Thus the data are consistent with the possibility of an exponential decrease in the accident rate. 
Corresponding to any given probability level there will evidently be a highest and a lowest value of k 
which will be consistent with the data, at this probability level. These two values of k will give a
confidence interval for k, assuming the decrease to be exponential in form. Similar arguments will apply
to any other parametric family of decremental curves. 


REFERENCES 


BIRNBAUM, Z. W. (1952). Numerical tabulation of Kolmogoroff's statistic for finite sample size.
J. Amer. Statist. Ass. 47, 425-41.

MAGUIRE, B. A., PEARSON, E. S. & WYNN, A. H. A. (1952). The time intervals between industrial
accidents. Biometrika, 39, 168-80.


Further notes on the analysis of accident data 
By B. A. MAGUIRE, E. S. PEARSON and A. H. A. WYNN


1. We are very glad that Prof. Barnard (1953) has drawn attention in the preceding paper to the
fact that the Kolmogoroff test may be used very simply to establish departure from randomness in a
series of events occurring in sequence, either in time or space*. The figure reproduced below illustrates
diagrammatically the application of the test to the accident data from our earlier paper (Maguire,
Pearson & Wynn, 1952) used by Barnard. If $t_i$, measured in days from 6 December 1875, represents the
time of occurrence of the ith accident following that which occurred at t = 0, then the 110 cumulative
sample points $(t_i, i)$, i = 0, 1, ..., 109, have been plotted (section A-B of the chart). A central,
continuous line joins the first and last points (0, 0) and $(t_n, n)$, where n = 109, $t_{109}$ = T = 26,263; this is the
theoretical cumulative line. Two parallel 'control' lines have been drawn on each side of this line at
distances (measured parallel to the axis of i) of $\pm 1.6276\sqrt{n} = 17.0$, forming a significance belt.

2. The Kolmogoroff theorem states that if the accidents have occurred at random with constant
expectation during the period T, then the probability is approximately only 0.01 that the track of the
points $(t_i, i)$ will pass outside this belt. The points cross the upper limit of the belt where


Liatlive } 
2 1 


3931. Birnbaum (1952) has shown that when n> 100 there is good agreement as far out in 
uil as the 1% point between the true and the limiting distributions of the Kolmogoroff statistic. 
f 4 ~ " 7 1 
{f we may assume that the approximation is also adequate at more extreme limits, it is possible to make 


»f Smuirnoff’s (1939, 1948) table of the limiting probability integral. This shows that 











nD, maximum | nt; T-— 
’ > ~ ’ 4 > } f, R; y 
vhich is 24-6 at 2 3, is also significant at the 0-001 % level. It is quite clear, therefore, as Barnard 
1 Cc 1 } . - mx ] 
that there is evidence of very significant departure from randomness, the average interval 
} i i iu y th period 
I following up o ) . points cor rning m is of analysis suggested by Barnard’s 
" " " , " ° 1 " 9 lo } 
t is perhaps desirable to 1 few words apout t pa cular accid nt data of this example The 
} £ ~) 4 * “a “Ee FR [ed *: 
I aken from Table 1 of our 1952 paper and sho he intervals between mining accidents in 
I in due to explosions involving more tl f i 1, during the 75-yea d 1875-1950 
; - ct} 
troduced to illu t in pointsin our di on—tor example I se of the following 
pothe 1a ld 1 l ) tel I Va I t Ul npling 
1 tial popu 
t t Y i | ] 
t t pene ariance) applied to s l 11S, 
( Fist g-t ) d o1 the longest interval to the a 1ge interva 
Neither test established a significant departure from the exponential law. 
Although we might and perhaps should have proceeded further, we did not in fact attempt to analyse
these data further in the paper, i.e. we did not examine for changes in accident expectation within the
period. The result brought out by the Kolmogoroff test, namely, that the accident intervals were shorter
at the beginning than at the end of the period, can also be established quite simply by using another test
which we gave (Maguire et al. 1952, p. 172) but did not apply to these data. If $T_1$ is the time covered by the
first 54 accidents and $T_2$ the time covered by the last 54, then
\[ T_1 = t_{54} - t_0 = 8042, \qquad T_2 = t_{109} - t_{55} = 16864. \]
* Some remarks of Bartlett (1949, p. 216) are also of interest in this connexion. 
















If the accidents have occurred randomly with constant expectation, $T_2/T_1$ should be distributed
as a variance ratio, F, with degrees of freedom $\nu_1 = \nu_2 = 108$. But F = 16864/8042 = 2.10, which is
significant at the 0.1% level.


5. To study the changes that may have occurred in the risk of explosions in mines would involve the
consideration of a very complex story, and it would be out of place to make the attempt here. The 
following summary, however, gives a broad picture of what seems to have occurred. There appears to 
have been a substantial improvement in mine safety from 1870-90, and the average gravity of explosions, 
as judged from the number of casualties per explosion, was less in the period 1920-50 than between 
1890-1920, but the frequency of explosions was not reduced; on the other hand, there may have been 
some increase in the risk of explosion in recent years in relation to coal output, the number of men in 
employment and the number of mines being worked, but this has not been fully investigated by the 
authors. There has been a considerable reduction in mining accidents in some other important categories. 


























































































































[Figure: cumulative sample points plotted against time in days (thousands), with the theoretical cumulative line and 1% limits for $nD_n$, shown for the actual period AB and for the period CD, 20 accidents later in the cycle.]


6. As Barnard has pointed out, the Kolmogoroff test is likely to be efficient in detecting a monotonic 
change in the risk A(t) whether this is of exponential character or follows a linear trend. Under these 
conditions, the plot of the cumulative points $(t_i, i)$ moves away from the cumulative line and then comes
back again in a single broad sweep. If A(t) does not change monotonically the position may, however, 
be very different, since the track of the cumulative points may now cross the theoretical line several 
times and never reach the boundary of the belt. For example, under certain conditions A(t) might 
fluctuate as an autoregressive function; we might then arrange the sequence of intervals in a closed 
cycle and look for a test for detecting changes in A(t) which is independent of any particular starting- 
point in the cycle. 


7. In the diagram we have introduced this idea by adding the Ist, 2nd, 3rd, ..., etc., intervals after 
the 109th. If now we apply the Kolmogoroff test to the stretch CD instead of AB, i.e. to the 110 accidents 
of the cycle starting at the 20th accident of the original series, the expectation line and the two parallel 
0-5 % limits are as shown by broken lines. The track of cumulative sample points now never passes 







outside the 99 % belt and significance would not be established by the test. The maximum value of 
$|nt_i/T - i|$ is now 13.3, which (using the limiting distribution of Smirnoff) falls near the 8% significance
level. 


8. Although the calculation is somewhat laborious with n so large, it is of interest to consider the
application of a further test which is allied to that of Kolmogoroff, namely, the $\omega^2$-test developed by
Cramér (1928), von Mises (1931) and Smirnoff (1936). If F(x) is the continuous cumulative frequency
function of a random variable x specified by the hypothesis tested, and if $x_1, x_2, ..., x_n$ are n observed
values of x arranged in ascending order of magnitude, then

\[ \omega^2 = \frac{1}{12n^2} + \frac{1}{n}\sum_{i=1}^{n}\Big\{F(x_i) - \frac{2i-1}{2n}\Big\}^2. \quad (1) \]

For the case where the distribution of x = t is rectangular in the interval (0, T), $F(x_i) = F(t_i) = t_i/T$
and we may write

\[ n\omega^2 = \frac{1}{12n} + \frac{1}{n^2}\sum_{i=1}^{n}\{nt_i/T - (i - 0.5)\}^2. \quad (2) \]

Thus, while the Kolmogoroff test uses the maximum value of n times the difference $t_i/T - i/n$, the
$\omega^2$ test uses the sum of squares of the n differences $d_i = t_i/T - (2i-1)/(2n)$. It will be seen that if
the pattern of the n events is reproduced on a line of unit length, then $d_i$ is the amount by which
the ith point is displaced from the corresponding point in the regular series

\[ 1/(2n),\ 3/(2n),\ \dots,\ (2i-1)/(2n),\ \dots,\ (2n-1)/(2n). \]

Thus if denser concentrations of points alternate with stretches of rarer occurrence, $\omega^2$ may increase
significantly above expectation.
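The statistic nω² = 1/(12n) + (1/n²) Σ {nt_i/T - (i - 0.5)}² for a rectangular hypothesis on (0, T) can be computed with a few lines of Python. The function name and the two small data sets below are hypothetical illustrations, not the accident data.

```python
def n_omega_squared(event_times, total_time):
    """n*omega^2 for events on (0, total_time) under a rectangular hypothesis."""
    times = sorted(event_times)
    n = len(times)
    total = sum((n * t / total_time - (i - 0.5)) ** 2
                for i, t in enumerate(times, start=1))
    return 1.0 / (12.0 * n) + total / n ** 2

n = 5
# Regular series 1/(2n), 3/(2n), ..., (2n-1)/(2n): every displacement d_i is zero.
regular = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
baseline = n_omega_squared(regular, 1.0)
# Points crowded near the start of the unit interval give a much larger value.
clustered = n_omega_squared([0.01, 0.02, 0.03, 0.04, 0.05], 1.0)
```

For the regular series the statistic takes its smallest possible value, 1/(12n), and any clustering of the points increases it.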


9. The first two moments of $n\omega^2$ have been known for some time, but we are indebted to Mr B. A. M.
Thomas* for expressions for the 3rd and 4th moments about the mean. Thus we have for the first four
moments:

\[ \mu_1' = \frac{1}{6}, \qquad \mu_2 = \frac{4n-3}{180n}, \]
\[ \mu_3 = \frac{32n^2 - 61n + 30}{3780n^2}, \qquad \mu_4 = \frac{496n^3 - 1532n^2 + 1671n - 630}{75600n^3}. \quad (3) \]

The distribution is very far from normal. It will be found that when $n \to \infty$ the limiting values of the
moments agree with the values of the cumulants of the limiting distribution of $n\omega^2$ given by Anderson
& Darling (1952). The use of these moments in deriving an approximation to the distribution of $n\omega^2$ in
relatively small samples requires much fuller consideration, but the following values of the standard
deviation and the moment ratios $\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$, which we owe to Thomas, show that when
n = 109 the distribution of $n\omega^2$ must be approaching the limiting form:








n     S.D.    β1     β2
10    0.143   5.53   11.23
20    0.146   6.03   12.24
50    0.148   6.33   12.87
100   0.149   6.43   13.08
1000  0.149   6.52   13.26
∞     0.149   6.53   13.29
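The tabulated standard deviations and moment ratios can be reproduced from the moment expressions with a short Python sketch. It assumes μ2 = (4n-3)/(180n), μ3 = (32n² - 61n + 30)/(3780n²) and μ4 = (496n³ - 1532n² + 1671n - 630)/(75600n³); the denominator 3780n² and the coefficient 1532 are taken here as the values that reproduce the tabulated β1 and β2, and the function name is hypothetical.

```python
import math

def n_omega_squared_moments(n):
    """Return (S.D., beta_1, beta_2) of n*omega^2 for sample size n."""
    mu2 = (4 * n - 3) / (180 * n)
    mu3 = (32 * n ** 2 - 61 * n + 30) / (3780 * n ** 2)
    mu4 = (496 * n ** 3 - 1532 * n ** 2 + 1671 * n - 630) / (75600 * n ** 3)
    return math.sqrt(mu2), mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2

rows = {n: n_omega_squared_moments(n) for n in (10, 20, 50, 100, 1000)}
limit = n_omega_squared_moments(10 ** 9)   # stands in for the limiting row
```

Rounded to the figures shown, the computed rows agree with the table, and the values for very large n approach 0.149, 6.53 and 13.29.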




















We shall therefore use below the percentage points of the limiting distribution tabled by Anderson & 
Darling (1952, p. 203), in particular 


5%      4%      1%      0.1%
0.461   0.499   0.743   1.168


* The results were included in an essay presented as part of the Examination for the B.Sc. Special Degree 
of the University of London (1952). 










10. We have now calculated $n\omega^2$ from equation (2)* for the n = 109 accident times $t_i$ considered
previously, (a) starting from the zero point at 6 December 1875 and (b) from the date of the 20th accident
in the artificial cycle described in para. 7. We then find for (a) and (b):

(a) $n\omega^2$ = 1.541, which is a value far beyond the 0.1% significance level given above;

(b) $n\omega^2$ = 0.474, a result significant at the 5% (but not at the 4%) level.

It follows that for the present data, in its original form, both the Kolmogoroff and $\omega^2$-test give very
clear evidence of departure from randomness. When, however, the intervals are arranged in cyclical
form and the analysis starts at the 20th accident, the former test does not, and the latter does, establish
significance at the 5% level.


11. It would not be legitimate to make any general comparison of the two tests on the basis of these
special results. One broad conclusion, however, may, we think, be drawn. As one of us has emphasized
before (Pearson, 1942), while it seems attractive to transform a statistical problem into one of testing
whether a sample has been drawn from a rectangular population, it is still not possible to determine the
most efficient test to use for the purpose unless we know the type of departure from the rectangle
that is likely to arise. It appears that the Kolmogoroff test will only be powerful in detecting certain
kinds of variation in $\lambda(t)$ and a similar posit
REFERENCES
ANDERSON, T. W. & DARLING, D. A. (1952). Ann. Math. Statist. 23, 193-212.
BARNARD, G. A. (1953). Biometrika, 40, 212.
BARTLETT, M. S. (1949). J. Roy. Statist. Soc. B, 11, 211.
BIRNBAUM, Z. W. (1952). J. Amer. Statist. Ass. 47, 425.
CRAMÉR, H. (1928). Skand. Aktuartidskr. 11, 13-74.
MAGUIRE, B. A., PEARSON, E. S. & WYNN, A. H. A. (1952). Biometrika, 39, 168-80.
VO Mi R LYoL) vi sche j srechy ] \ i Deu é 
p } (1942 Biometr L1—-16 
(193 CO.R. 1 i x 202, { 
N. \ 193§ Bull. M Vos 2, fa 
p 18 | t. 19, 275 
On a method of estimating biological populations in the field













However, one of my biologist colleagues has proposed that a method of estimation be devised which is 
based on observed distances between individuals and their nearest neighbours. More definitely, let 
the region be a plane area and let the co-ordinates (x, y) of the position of any individual be independent 
stochastic variables each obeying a rectangular distribution law. An individual being chosen at random, 
observe the distances to his nearest neighbours in each of the four quadrants determined by parallels 
to the co-ordinate axes drawn through his position. From a sample distribution of such distances, how 
should the total population be estimated and how good an estimate can be obtained? 

Since individuals are assumed to obey a constant distribution function over the area, I will choose 
points Q(x, y) at random from which to observe distances δ to the nearest members of the population.
For simplicity, I will take the region to be a square with its vertices at O(0, 0), A(a, 0), B(a, a), 
C(0, a), and I will consider only distances to individuals lying in the first quadrant with Q as origin. 





This avoids the complication that for a choice of Q the distances to the nearest neighbour in each of 
the four quadrants about Q will not be independent. Further, I assume that the population observed 
is alive and freely moving, or at any rate is such that each new choice of Q gives an independently 
observed value 





[Figure: the square region with vertices O(0, 0), A(a, 0), B(a, a), C(0, a) and a sampling point Q(x, y).]

[...]
















Now a sample of δ's obtained for N random choices of Q will contain $k_1$ δ's less than or equal to the
distance from Q to the nearer of the right and upper boundaries, i.e. belonging to case (1), $k_2$ δ's
belonging to case (2), $k_3$ δ's belonging to case (3), and $k_4$ instances in which no δ was observable. To
attempt to use all the information in the sample requires that a rather complicated likelihood function
be maximized. However, the short table of values of $P_1(M)$ given shows that, if M is more than 50, the
use of only the portion of the sample in which the δ's are not greater than the nearer of the right and
the upper boundaries utilizes on the average some three-quarters or more of the total information
available.


      M     P_1(M)        M     P_1(M)
     10     0.5058       50     0.7442
     20     0.6216      100     0.8134
     30     0.6804      200     0.8652


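The likelihood function based on the $k_1$ observations in which δ is not greater than the smaller of
w and z for each Q chosen is, from (4),

$$P = \prod_{i=1}^{k_1} M\,\frac{\pi\delta_i}{2a^2}\left(1 - \frac{\pi\delta_i^2}{4a^2}\right)^{M-1}. \qquad (5)$$

Maximizing log P with respect to M, I get as the maximum-likelihood estimate of M,

$$\hat{M} = -k_1 \Big/ \sum_{i=1}^{k_1} \log\left(1 - \frac{\pi\delta_i^2}{4a^2}\right). \qquad (6)$$

The asymptotic variance of $\hat{M}$ is $M^2/k_1$.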

A rapid method for estimating the correlation coefficient from the 
range of the deviations about the reduced major axis 


By C. H. LEIGH-DUGMORE, Dunlop Research Centre, Birmingham 


In many studies of organic correlation (in palaeontology, for example) it is useful to fit the allometric 
growth line of Huxley (1932) by the reduced major axis method of Kermack & Haldane (1950). This 
method has three advantages: first, it is more appropriate than conventional regression analysis 
because the biological variability of the material is usually much greater than that due to errors of 
measurement and because the terms ‘dependent variate’ and ‘independent variate’ have no real 
meaning; secondly, the reduced major axis is invariant under both rotation of axes and change of 
scale—the latter is more important; thirdly, its computation involves only the sums and sums of 
squares of the two variates and not the sum of products. When large samples are being studied (as by 
Parkinson, 1952) of individuals each with several measured characters such a reduction in computation 
is a great advantage. Kermack & Haldane give formulae for the sampling variances of the estimates 
of the allometric growth constants obtained by their method. Each formula involves the correlation 
coefficient so that, if tests of significance of the growth constants are required and the product-moment 
correlation has to be calculated, this advantage in computation is lost.

When the distribution of the variates, or, in studies of allometry, of their logarithms, is homo- 
scedastic and nearly normal (this is often the case), an estimate of the correlation coefficient can quite 
simply be made from a consideration of the range of deviations among the scatter of points about the 
reduced major axis. 

Following, as closely as possible, the notation of Kermack & Haldane (1950) let x, y be the two 
variates under consideration and suppose their logarithms, X, Y, to base h to be normally distributed 
(in practice h = 10, usually; decimal are more convenient than natural logarithms). The equation to 
the reduced major axis of the X, Y correlation surface (the Kermack & Haldane line) is 

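$$\frac{Y - \bar{Y}}{s_Y} = \frac{X - \bar{X}}{s_X}. \qquad (1)$$

We will rewrite this in the form

$$\eta = \bar{Y} - \frac{s_Y}{s_X}\bar{X} + \frac{s_Y}{s_X}X, \qquad (2)$$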










in which η is the estimate of Y from (1) for any given value of X. This may seem artificial, since the
Kermack & Haldane line is not intended for estimation in the sense that the regression line is used;
η is introduced for convenience in the following argument.

In (2)
$$\bar{Y} - \frac{s_Y}{s_X}\bar{X} = \log b, \qquad (3)$$
$$\frac{s_Y}{s_X} = a, \qquad (4)$$
where a, b are estimates of the parameters α, β in the allometric growth equation
$$y = \beta x^{\alpha}. \qquad (5)$$


The sum of squares of the deviations of the Y-values from the line (2) is

$$S(Y - \eta)^2 = S\{Y - \bar{Y} - a(X - \bar{X})\}^2 = 2(1 - r)\,S(Y - \bar{Y})^2, \qquad (6)$$

where r is the correlation coefficient of the sample of X, Y. Hence

$$r = 1 - \frac{S(Y - \eta)^2}{2\,S(Y - \bar{Y})^2} \qquad (7)$$

provides an estimate of the correlation, given the sums of squares of the variate Y about the
Kermack & Haldane line and about the mean value, $\bar{Y}$.

In practice, $S(Y - \bar{Y})^2$ will already have been calculated and used to find a. It remains to estimate
$S(Y - \eta)^2$.

When the distribution is reasonably homoscedastic and normal, an estimate of this sum of squares, 
sufficiently reliable for many purposes, can be obtained from the range of the sample of deviations 
about the fitted line (2) using a table of the mean range in different-sized samples such as Table XXII 
in Tables for Statisticians and Biometricians, Part II.

To apply this method the data are plotted, usually on logarithmic graph paper, the extreme 
deviations are found by inspection and an estimate of the correlation coefficient is easily calculated as 
in the following example. When a desk-calculating machine is not available the range method is 
undoubtedly more speedy than calculation of the product-moment correlation. For machine calcula- 
tion it is necessary to look up the logarithms of the data; for the range method the data have to be 
plotted. If a plot is necessary for other purposes the range method has the advantage in speed; if 
a plot is otherwise unnecessary it may still have the advantage depending on the sample size and the 
fineness of any grouping of the data. 

As an example the method can be applied to the logarithmic data of the Micraster cor-anguinum 
population, used as an illustrative example by Kermack & Haldane (1950). 

If the logarithms of these data are plotted (to base 10) and the line (2) fitted (calculated from the
logarithmic data) the greatest positive deviation from the line is 0.081 and the greatest negative
deviation -0.125. Hence the range of the deviations is 0.206 and, since the mean range of samples of
338 from a normal population is about 5.83 times the standard deviation, we have 0.0354 as an estimate
of the standard deviation of the deviations about the line. Substituting this and Kermack & Haldane's
value for $s_Y^2$ (= $s_2^2$ when corrected to logarithms to base 10) in (7) we have a value of r = 0.8839 to
compare with 0.8865 ± 0.0116 given by Kermack & Haldane.
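
The arithmetic of the range method can be sketched as follows. The sketch assumes that the two sums of squares in (7) are taken with the same divisor, so that their ratio reduces to the ratio of the two variances; the function name is hypothetical, and the value of `s_y` used in the example is an illustrative figure, not Kermack & Haldane's published one.

```python
def range_correlation(dev_range, mean_range_factor, s_y):
    """Estimate r from the range of deviations about the reduced major axis.

    dev_range:         greatest positive minus greatest negative deviation
    mean_range_factor: mean range of a normal sample of this size, in units of sigma
    s_y:               standard deviation of Y about its mean
    Returns (estimated s.d. of the deviations, estimated r).
    """
    sigma_dev = dev_range / mean_range_factor
    # Equation (7), with both sums of squares assumed to share one divisor.
    r = 1.0 - sigma_dev ** 2 / (2.0 * s_y ** 2)
    return sigma_dev, r

# Range 0.081 - (-0.125) = 0.206 and factor 5.83 are from the example in the text;
# s_y = 0.0735 is a hypothetical value used only to exercise the function.
sigma_dev, r = range_correlation(0.081 + 0.125, 5.83, s_y=0.0735)
```

Because the range enters squared, a misread extreme deviation changes the estimate of 1 - r roughly twice as fast in relative terms, so the plot should be read carefully at its two extremes.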

The method has been used in other examples by Dr D. Parkinson and the author and has been 
compared with the product-moment correlation. The agreement has been found satisfactory and the 
differences have not affected the results of tests of significance. 


REFERENCES 


HUXLEY, J. S. (1932). Problems of Relative Growth. London: Methuen.

KERMACK, K. A. & HALDANE, J. B. S. (1950). Organic correlation and allometry. Biometrika, 37, 30.

PARKINSON, D. (1952). Ontogeny and phylogeny of the Lower Carboniferous Brachiopod Schizophoria
resupinata. Abstr. Proc. Geol. Soc. no. 1484, p. 52.
















The effect of overlapping in bacterial counts of incubated colonies 


By C. MACK, Shirley Institute, Manchester 


INTRODUCTION 


To estimate the number of bacteria present in a given atmosphere a sample of the air is drawn over
a plate of suitable material in such a manner that the bacteria are deposited thereon. The plate is then
incubated and the bacteria multiply and form colonies. However, colonies which are close together may
coalesce or one type of bacteria may inhibit the growth of another type by antibiotic effect so that the
count of colonies is an underestimate of the original number of bacteria present, and this paper is
concerned with the correction to be applied in such circumstances. Armitage (1949), in a paper on
overlapping in dust particle counting, mentions the similarity of the bacterial colony overlap problem but
does not deal with it in detail. However, there are some differences in the two problems, e.g. the colonies
may cover a substantial fraction of the total area of the plate, the dust particles usually do not;
furthermore, dust particles are of small but finite size often of different shape, while bacteria are,
before incubation, good approximations to mathematical points.
There is a further difference between this paper and Armitage's in the approach adopted. Thus
Armitage estimates by the first few terms of an expansion the number of 'clumps' which would be
[...] on the plate, the convergence of the expansion depending [...] the ratio of dust particle
area [...] the total area. Though I have given formulae for the [...] 'aggregates' formed by
[...] at random (Mack, 1948, 1949) this direct approach [...] treating the [...] inverse
probability. By this means straightforward [...] the most probable [...] bacteria and the posterior
probability of the [...] bacteria [...] obtained appropriate to [...] particular count. Lidwell (1948)
has [...] the analysis is based on admittedly over-simplified [...]

[...] paper may also [...] counting. Being exact, they avoid [...] applicable to counts covering [...]

[...] the area of each of these, A being [...] m bacteria fell originally and [...] in the areas
$a_1, a_2, \ldots, a_q$. [...]

[...] (2) [...] (3) [...] (4)

[...] after incubation [...] the principle of maximum likelihood [...]



A similar argument to the above shows that if the prior probability were proportional to (a) 1/m,
(b) (m + 1), then the most probable number would be, in case (a) [(q - r)/(1 - r)], and in (b) [(q + r)/(1 - r)].
Since q is usually of the order of 100 the differences are negligible in practice, so that the reasonableness
of the inverse probability treatment is justified.


3. If there are several different types of bacteria, then each type may be treated separately. Thus, if
there are $q_1$ colonies of type 1, formulae (3) and (4) will hold for the most probable number of type 1
bacteria and its posterior probability with q replaced by $q_1$ and where r is the ratio of the type 1 areas
plus any area in which their growth might be inhibited to A.


4. The assumption of randomness may have to be modified if, as in some kinds of apparatus, the plate
rotates under a rectangular slit and points on it at a distance R from the centre of rotation have a
probability of receiving bacteria which varies as 1/R. All that is required, however, is that each of the
areas $a_1, \ldots, a_q$ shall be suitably modified (multiplying by $1/R_g$, where $R_g$ is the distance of the centre of
gravity of the area from the centre of rotation, will be sufficient unless it is a large area).


5. If qr is large it is necessary to calculate f(m) from (4) for quite a number of values of m before
obtaining a clear idea of this posterior probability distribution. However, it is possible to represent
f(m) approximately, if qr > 10, by a continuous normal distribution with a maximum at $q/(1-r) - \tfrac{1}{2}$ and a
standard deviation
$$(qr)^{1/2}/(1 - r). \qquad (5)$$
For, replacing log (m!) by log Γ(1 + m) and this, in turn, by its asymptotic expansion
$$(m + \tfrac{1}{2})\log m - m + \tfrac{1}{2}\log(2\pi),$$
we find that
$$\log f(m) = m\log(mr) - (m - q)\log(m - q) + \tfrac{1}{2}\log m - \tfrac{1}{2}\log(m - q) + \text{constant}, \qquad (6)$$
$$\frac{d\log f(m)}{dm} = \log(mr) - \log(m - q) + \frac{1}{2}\left[\frac{1}{m} - \frac{1}{m - q}\right], \qquad (7)$$
$$\frac{d^2\log f(m)}{dm^2} = \left[\frac{1}{m} - \frac{1}{m - q}\right] - \frac{1}{2}\left[\frac{1}{m^2} - \frac{1}{(m - q)^2}\right]. \qquad (8)$$

(7) shows that f(m) has a maximum when
$$m = \frac{q}{1 - r} - \frac{1}{2} + O\left(\frac{1}{q}\right) = m_0, \text{ say}. \qquad (9)$$
For this value of m, (8) shows that
$$\frac{d^2\log f(m)}{dm^2} = -\frac{(1 - r)^2}{qr} + O\left(\frac{1}{q^2}\right). \qquad (10)$$

We now compare f(m) with a distribution
$$g(m) \propto \exp\{-[m - q/(1 - r) + \tfrac{1}{2}]^2/[2\sigma^2]\}. \qquad (11)$$

(5) is now proved by equating the second derivatives of log f(m) and log g(m), provided higher derivatives
have only a small effect. To estimate the magnitude of this effect consider the term involving the third
derivative in the Taylor series of log f(m) about $m = m_0$. For $m = m_0 + z(qr)^{1/2}/(1 - r)$ we then find that
$$f(m) \propto \exp\left[-\tfrac{1}{2}z^2 + \frac{\tfrac{1}{2}(1 + r)}{3(qr)^{1/2}}z^3 + O\left(\frac{z^4}{q}\right)\right].$$

An alternative normal curve, for which I am indebted to the referee, may be obtained by using the
fact that f(m) is the general term of the negative binomial
$$\left(\frac{1}{1 - r} - \frac{r}{1 - r}\right)^{-(q+1)}$$
which has a mean (q + r)/(1 - r) and a variance $r(q + 1)/(1 - r)^2$. These may be taken as the mean and
variance of the equivalent normal curve and the ½ 'correction for continuity' used (Yates, 1934). The
difference between the two normal curves is not significant if q = O(100) which is usually the case.
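
The closeness of the normal curve can be checked numerically. The sketch below assumes that f(m) is the negative binomial term $\binom{m}{q} r^{m-q}(1-r)^{q+1}$ for m ≥ q, a form chosen because it reproduces the mean (q + r)/(1 - r) and variance $r(q+1)/(1-r)^2$ quoted in this section; the function names and the values q = 100, r = 0.25 are illustrative only.

```python
from math import comb, exp, pi, sqrt

def posterior(m, q, r):
    """Assumed posterior probability that m bacteria fell, given q colonies counted."""
    if m < q:
        return 0.0
    return comb(m, q) * r ** (m - q) * (1.0 - r) ** (q + 1)

def normal_approx(m, q, r):
    """Normal curve centred at q/(1-r) - 1/2 with s.d. sqrt(qr)/(1-r)."""
    centre = q / (1.0 - r) - 0.5
    sd = sqrt(q * r) / (1.0 - r)
    return exp(-((m - centre) ** 2) / (2.0 * sd * sd)) / (sd * sqrt(2.0 * pi))

q, r = 100, 0.25                       # illustrative values with qr = 25 > 10
support = range(q, q + 400)            # the tail beyond this range is negligible here
probs = [posterior(m, q, r) for m in support]
mode = max(support, key=lambda m: posterior(m, q, r))
mean = sum(m * p for m, p in zip(support, probs))
gap_at_mode = abs(posterior(mode, q, r) - normal_approx(mode, q, r))
```

The mode found by direct search agrees with the integer part of q/(1 - r), and the difference between the exact ordinate and the normal ordinate at the mode is small compared with the ordinate itself.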


6. In counting dust particles, the particles are usually graded by the size of their apparent areas. 
Now smaller particles may be obscured by, or overlap larger particles, or may overlap each other to 
form apparently larger particles; thus the count of small particles is an underestimate. To calculate 
a correction to the number of particles q, say, in a particular grade of size, we find the area in which such 













particles could be covered by, lie on the top of, or by overlapping form part of, larger particles. Let r be 
the ratio of this area to the total area in which particles could fall. Then, by the same argument as in 
Section 2, equation (3) gives the most probable number of particles of the given grade. It is true that this 
does not tell us how many of the larger particles are really composed of several smaller ones and that the 
count of these larger particles should be reduced. Mostly such combinations will consist of one small
overlapping one large particle, and if the small particle is subtracted from the apparent area the particle 
will usually remain in the same grade. The reduction due to this effect is thus, in general, unimportant. 
In any case the theoretical results worked out in this paper should enable experimenters to decide what 
concentration of dust particles (or bacterial colonies) is permissible to achieve a desired reliability. 


REFERENCES 


ARMITAGE, P. (1949). Biometrika, 36, 257.

MACK, C. (1948). Phil. Mag. 39, 778.

MACK, C. (1949). Proc. Camb. Phil. Soc. 46, 285.

LIDWELL, O. M. (1948). Spec. Rep. Ser. Med. Res. Coun., Lond., no. 262, pp. 48 and 341.
YATES, F. (1934). Suppl. J. R. Statist. Soc. 1, 217.


Non-normality in two-sample t-tests 


By D. G. C. GRONOW, Institute of Aviation Medicine 





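1. In a recent note (1951) by the present author the bias and power of two-sample t-tests were
discussed. The notation in the present paper is somewhat different to that employed before, and
accordingly the previous discussion is summarized here using the new notation. Given two samples
$x_i$ ($i = 1, 2, \ldots, n$) and $x_j'$ ($j = 1, 2, \ldots, n'$) the two criteria

$$u = (\bar{x} - \bar{x}')\Big/\sqrt{\frac{(n - 1)s^2 + (n' - 1)s'^2}{n + n' - 2}\cdot\frac{n + n'}{nn'}}$$
and
$$v = (\bar{x} - \bar{x}')\Big/\sqrt{\frac{s^2}{n} + \frac{s'^2}{n'}},$$
where
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{x}' = \frac{1}{n'}\sum_{j=1}^{n'} x_j',$$
$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad s'^2 = \frac{1}{n' - 1}\sum_{j=1}^{n'}(x_j' - \bar{x}')^2,$$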
were considered under the assumption that the populations generating the samples were both normally 
distributed, but might have different variances. The method used to investigate bias and power was to 
obtain approximations (in series form) to the moments of u and of v, to approximate to the distributions 
of these criteria by frequency curves having the same first four moments, and to find the proportion of 
the frequency curve cut off by a nominal significance level. This procedure worked reasonably well over 
the range of values considered, but the algebra involved in the calculation of the moments was heavy. 
It was clear, therefore, that the labour involved in calculating the moments of the criteria under the 


assumption that the samples had come from a non-normal population would be prohibitive. Accordingly 
another method was used. 
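
The two criteria can be computed directly from a pair of samples. The sketch below assumes that u is the usual pooled-variance two-sample statistic and v the statistic in which each sample variance is divided by its own sample size, the pair contrasted by Welch (1937); the function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_criteria(x, y):
    """Return (u, v) for two samples, under the definitions assumed above."""
    n, m = len(x), len(y)
    diff = mean(x) - mean(y)
    s2x, s2y = variance(x), variance(y)          # unbiased sample variances
    pooled = ((n - 1) * s2x + (m - 1) * s2y) / (n + m - 2)
    u = diff / sqrt(pooled * (n + m) / (n * m))  # pooled-variance criterion
    v = diff / sqrt(s2x / n + s2y / m)           # separate-variance criterion
    return u, v

# Equal sample sizes: the two criteria coincide.
u_eq, v_eq = two_sample_criteria([1, 2, 3, 4], [2, 4, 6, 8])

# Unequal sample sizes with the more variable sample the smaller one.
u_ne, v_ne = two_sample_criteria([1, 2, 3, 4, 5, 6], [2, 8])
```

When the smaller sample has the larger variance, pooling understates the standard error of the difference, so u is larger in absolute value than v.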


2. David & Johnson (1951) have pointed out that the distribution of any ratio of two random
variables is found usually in order to be able to make a probability statement of the form

$$P\{x/y > L_{\frac{1}{2}\alpha}\} = \tfrac{1}{2}\alpha,$$

where $L_{\frac{1}{2}\alpha}$ is the value of the abscissa of the distribution, the ordinate of which cuts off a proportion
½α in the tail. This probability statement is the same as

$$P\{(x - L_{\frac{1}{2}\alpha}\,y) > 0\} = \tfrac{1}{2}\alpha,$$

and, following Cramér, they pointed out that the distribution of $x - L_{\frac{1}{2}\alpha}\,y$ could equally well be studied
as that of the ratio, provided y is never negative. It will be noted that the criteria u and v can both be
expressed in terms of k-statistics as functions such as

$$b(k_1 - k_1')(k_2 + ak_2')^{-1/2},$$







where a and b are different for u and for v, except where n = n', when u = v. Thus we may make the
statement
$$P\{|b(k_1 - k_1')(k_2 + ak_2')^{-1/2}| > L_{\frac{1}{2}\alpha}\} = \alpha,$$
or
$$P\{b^2(k_1 - k_1')^2 - L^2_{\frac{1}{2}\alpha}(k_2 + ak_2') > 0\} = \alpha.$$
The moments of
$$J = b^2(k_1 - k_1')^2 - L^2_{\frac{1}{2}\alpha}(k_2 + ak_2')$$
can be written down under very general conditions. If δ is the difference between the population means,
and if the dashed cumulants relate to the second population, then the first two moments of this quantity
are

$$E(J) = b^2\left\{\delta^2 + \left(\frac{\kappa_2}{n} + \frac{\kappa_2'}{n'}\right)\right\} - L^2_{\frac{1}{2}\alpha}(\kappa_2 + a\kappa_2')$$


&(J%) = b+ abe] ape( ay +S) - Lala +ax6) | 


ae A Kz aK; 
rerlels-)- Ae) 
, 2 








ne 7/8 n 
— 26°L}, ane (Kg+ ari) (+ +) ] 
a | Xa ark, Ki aa®k ‘ 
+Dia .*o sacha we) + eat ani) J. 


The third and fourth moments follow similarly. 


3. The moments calculated in §2 are quite general, the only assumption that is made about each 
population being that its cumulants exist up to any required order. It is possible, however, to simplify 
the expressions for these calculated moments somewhat if we make a further assumption regarding the 
functional form of the population sampled. In previous work it has been found that asymmetry in the 
population generating the samples has little effect on sampling distributions, but that kurtosis may 
alter nominal significance levels appreciably. We shall assume, therefore, that the parent populations 
of an investigation may be described by two symmetrical Gram-Charlier Type A series. It will follow that 


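$$\kappa_3 = \kappa_5 = \kappa_6 = \kappa_7 = 0 \quad\text{and}\quad \kappa_8 = -35\kappa_4^2,$$

resulting in considerable simplification of the moments of J. We have that

$$\kappa_1(J) = b^2\left\{\delta^2 + \left(\frac{\kappa_2}{n} + \frac{\kappa_2'}{n'}\right)\right\} - L^2_{\frac{1}{2}\alpha}(\kappa_2 + a\kappa_2'),$$

$$\mu_2(J) = 4b^4\delta^2\left(\frac{\kappa_2}{n} + \frac{\kappa_2'}{n'}\right)
+ b^4\left(\frac{\kappa_4}{n^3} + \frac{\kappa_4'}{n'^3}\right)
- 2b^2L^2_{\frac{1}{2}\alpha}\left(\frac{\kappa_4}{n^2} + \frac{a\kappa_4'}{n'^2}\right)
+ L^4_{\frac{1}{2}\alpha}\left(\frac{\kappa_4}{n} + \frac{a^2\kappa_4'}{n'}\right)$$
$$\qquad\qquad + 2b^4\left(\frac{\kappa_2}{n} + \frac{\kappa_2'}{n'}\right)^2
+ 2L^4_{\frac{1}{2}\alpha}\left(\frac{\kappa_2^2}{n - 1} + \frac{a^2\kappa_2'^2}{n' - 1}\right),$$

with similar, longer expressions for $\mu_3$ and $\mu_4$.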


4. Having obtained the moments of J it is necessary to approximate to its distribution by assuming
for this distribution a functional form, the parameters of which have been found from the known moments.
Preliminary calculations showed that the momental constants $\beta_1$ and $\beta_2$ would indicate a Pearson
Type IV as an approximate graduation. These curves are, however, intractable to handle arithmetically
and Johnson (1949) 'unbounded' type curves ($S_U$) were used instead. Using both tails of the t-distribution,
for the nominal test, i.e. a two-tail test, the bias due to kurtosis, in two specific cases, is shown in Tables
1(a) and 1(b). The effect of non-normality is seen to be noticeable for both the u and v criteria, but it
seems to be somewhat worse for v than for u when the sample sizes are unequal. This suggests that if the
populations may be non-normal as well as have unequal variances, the advantage pointed out by Welch
(1937) of using v rather than u when comparing the means of small unequal samples is no longer so
clear-cut.










Table 1(a). Probability, as a percentage, that u (= v) falls beyond the 5% points of the
t-distribution with 18 degrees of freedom. Case n = n' = 10, δ = 0

(Symmetrical Gram-Charlier populations, $\beta_2 = \beta_2'$.)

                      $\kappa_2'/\kappa_2$
  $\beta_2$        1        2        3
  2.5           5.33     5.58     5.92
  Normal        5.00     5.13     5.30
  4.0           5.51     5.74     6.13
  5.0           5.94     6.29     7.31








Table 1(b). Probability, as a percentage, that u and v fall beyond the 5% points of the
t-distribution with 18 degrees of freedom. Case n = 15, n' = 5, δ = 0

(Symmetrical Gram-Charlier populations, $\beta_2 = \beta_2'$.)

                               $\kappa_2'/\kappa_2$
  Criterion   $\beta_2$       1/4      1/2       1        2
  u           2.5            1.56     2.45     5.28    10.34
              Normal         1.40     2.25     5.00     9.77
              4.0            1.65     2.58     5.91    10.73
              5.0            1.83     2.81     6.55    11.14
  v           2.5            5.83     7.14     9.12    11.65
              Normal         5.47     5.85     6.83     7.99
              4.0            6.37     6.89     8.76    11.04
              5.0            7.50     9.11    11.37    13.32






















Table 2. Effect of non-normality in parent populations on the power of the u and v tests to detect a difference
in population means (using the 5% significance points for t with 18 degrees of freedom). Case
n = n' = 10, α = 0.05

Power of u (= v) test.

                                     $\delta/\sqrt{\tfrac{1}{2}(\kappa_2 + \kappa_2')}$
  $\kappa_2'/\kappa_2$   $\beta_2$      0      0.5     1.0     1.5     2.0     2.5     3.0
  1               2.5        0.053   0.196   0.590   0.893   0.983   0.997   1.000
                  Normal     0.050   0.182   0.562   0.892   0.987   0.998   1.000
                  4.0        0.055   0.205   0.594   0.887   0.980   0.997   1.000
                  5.0        0.059   0.216   0.587   0.883   0.978   0.997   1.000
  2               2.5        0.056   0.197   0.596   0.894   0.984   0.998   1.000
                  Normal     0.051   0.188   0.576   0.892   0.988   0.999   1.000
                  4.0        0.057   0.207   0.583   0.889   0.980   0.998   1.000
                  5.0        0.063   0.222   0.581   0.889   0.977   0.998   1.000
  3               2.5        0.059   0.200   0.594   0.889   0.986   0.998   1.000
                  Normal     0.053   0.185   0.559   0.893   0.987   0.999   1.000
                  4.0        0.061   0.212   0.573   0.885   0.980   0.998   1.000
                  5.0        0.073   0.224   0.590   0.879   0.978   0.998   1.000





































By varying the value of δ, the distance between the population means, we may study the effect of
changes in $\beta_2$, as well as in the variance ratio $\kappa_2'/\kappa_2$, on the power of the tests, that is to say on the
probability of establishing significance. For the case of equal sample sizes, with n = n' = 10, Table 2 shows
the power of the test for increasing values of the ratio

$$\theta = \delta/\{\tfrac{1}{2}(\kappa_2 + \kappa_2')\}^{1/2},$$

which appears to be the appropriate non-central parameter to use. Remembering the differences between
the effective significance levels, i.e. between the figures in the column headed δ = 0, it will be seen that
in this case of equal samples, the power function is remarkably little modified either by inequality in
population variance or departure from normality. This result suggests that provided the two samples
are equal or nearly so (as they generally will be in any planned experiment), the standard normal theory
power function for the t-test may be used, even when the variances are unequal and $\beta_2$ differs from 3.
Thus, for example, in assessing roughly how large samples must be to establish a given difference
in population means using the t-test, Pearson & Hartley's (1951, p. 115) chart for $\nu_1 = 1$ could be used with

$$\phi = \delta\sqrt{n}/\sigma, \qquad \nu_2 = n + n' - 2,$$


where the value taken for $\sigma^2$ would be a rough estimate of the average population variance, $\tfrac{1}{2}(\kappa_2 + \kappa_2')$,
based on past experience.

When the sample sizes are so unequal that the effective significance levels differ substantially from 
their nominal value, it becomes difficult to interpret the resulting modifications in the power function. 

Computations have been made for the case n = 15, n' = 5 and tables of the power function for both
u and v have been determined for the same values of $\beta_2$ as in Table 2 and for $\kappa_2'/\kappa_2$ = ¼, ½, 1, 2 and 3. It
is hoped to make use of these results in a further communication, but it may be remarked that, as for the
significance levels given in Table 1(b) above, the effect of non-equality of variance is far more important
than that of kurtosis.


My thanks are due to Dr F. N. David and Professor E. S. Pearson whose suggestions helped con- 
siderably in the preparation of this paper. 


REFERENCES 


DAVID, F. N. & JOHNSON, N. L. (1951). Biometrika, 38, 43.
GRONOW, D. G. C. (1951). Biometrika, 38, 252.

JOHNSON, N. L. (1949). Biometrika, 36, 149.

PEARSON, E. S. & HARTLEY, H. O. (1951). Biometrika, 38, 112.
WELCH, B. L. (1937). Biometrika, 29, 350.


Note on the Poisson Index of Dispersion 
By N. KATHIRGAMATAMBY, University College, London 


1. The distribution of the Poisson Index of Dispersion, and the validity of assuming that it is
distributed as $\chi^2$ with degrees of freedom equal to the number of observations minus one, has been discussed
at some length by Hoel (1943). He reached the conclusion that the assumption of the $\chi^2$ distribution
was valid for a small number of observations provided the Poisson parameter was large, but his method
was not altogether satisfactory in that it was based on a comparison of moment ratios, these moments
themselves being infinite series. An alternative approach has been to consider the conditional distribution
of the Index for a fixed total, say T, of the observed values. Provided T is not too large the probability
of every possible partition can be determined and hence the exact overall distribution of the Index
obtained by allowing T to vary according to the Poisson law. Cochran (1936) first put forward this
method of attack, and further calculations have recently been carried out by Lancaster (1952). Neither
dealt fully, however, with the case of more than four observations. We shall concern ourselves here
with the case where T may take any value, i.e. with the overall rather than the conditional distribution
of the Index. It is possible by means of the rewriting of a probability statement to calculate exact
moments and hence by curve fitting to obtain probability integrals which may be compared with those
of $\chi^2$. Further, by writing the moments in a general form the power of the Index of Dispersion test can
be obtained, when the alternative hypothesis is that the variance is greater than the mean.


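2. It is assumed that there are N random independent variables $x_1, x_2, \ldots, x_N$ which may take only
zero or positive integral values and each of which has the same distribution. If this distribution is
Poisson then the Index of Dispersion is written as

$$I = \sum_{i=1}^{N}(x_i - \bar{x})^2 \Big/ \bar{x},$$

where $\bar{x}$ is the mean, and the probability statement connected with the test of significance takes the form
$$P\{I > \chi^2_{N-1,\alpha}\} = \alpha$$
or
$$P\left\{\left(\sum_{i=1}^{N}(x_i - \bar{x})^2 - \bar{x}\,\chi^2_{N-1,\alpha}\right) > 0\right\} = \alpha.$$
If $k_1$ and $k_2$ denote the first two of Fisher's k-statistics this expression becomes
$$P\{((N - 1)k_2 - k_1\chi^2_{N-1,\alpha}) > 0\} = \alpha.$$
Accordingly we focus attention on
$$S = (N - 1)k_2 - k_1\chi^2_{N-1,\alpha},$$
the moments of which can be written down exactly. Assume, for the time being, that the cumulants of
the parent population are $\kappa_1, \kappa_2, \kappa_3, \ldots$. We note that both $k_1$ and $\kappa_1$ can never be negative. Easy algebra
gives

$$E(S) = (N - 1)\kappa_2 - \kappa_1\chi^2_{N-1,\alpha},$$

$$\mu_2(S) = (N - 1)^2\left\{\frac{\kappa_4}{N} + \frac{2\kappa_2^2}{N - 1}\right\} - 2(N - 1)\chi^2_{N-1,\alpha}\frac{\kappa_3}{N} + \chi^4_{N-1,\alpha}\frac{\kappa_2}{N},$$

$$\mu_3(S) = (N - 1)^3\left\{\frac{\kappa_6}{N^2} + \frac{12\kappa_4\kappa_2}{N(N - 1)} + \frac{4(N - 2)\kappa_3^2}{N(N - 1)^2} + \frac{8\kappa_2^3}{(N - 1)^2}\right\}$$
$$\qquad - 3(N - 1)^2\chi^2_{N-1,\alpha}\left\{\frac{\kappa_5}{N^2} + \frac{4\kappa_3\kappa_2}{N(N - 1)}\right\} + 3(N - 1)\chi^4_{N-1,\alpha}\frac{\kappa_4}{N^2} - \chi^6_{N-1,\alpha}\frac{\kappa_3}{N^2},$$

$$\kappa_4(S) = (N - 1)^4\left\{\frac{\kappa_8}{N^3} + \frac{24\kappa_6\kappa_2}{N^2(N - 1)} + \frac{32(N - 2)\kappa_5\kappa_3}{N^2(N - 1)^2} + \frac{8(4N^2 - 9N + 6)\kappa_4^2}{N^2(N - 1)^3}
+ \frac{144\kappa_4\kappa_2^2}{N(N - 1)^2} + \frac{96(N - 2)\kappa_3^2\kappa_2}{N(N - 1)^3} + \frac{48\kappa_2^4}{(N - 1)^3}\right\}$$
$$\qquad - 4(N - 1)^3\chi^2_{N-1,\alpha}\left\{\frac{\kappa_7}{N^3} + \frac{12\kappa_5\kappa_2}{N^2(N - 1)} + \frac{12\kappa_4\kappa_3}{N^2(N - 1)} + \frac{8(N - 2)\kappa_4\kappa_3}{N^2(N - 1)^2} + \frac{24\kappa_3\kappa_2^2}{N(N - 1)^2}\right\}$$
$$\qquad + 6(N - 1)^2\chi^4_{N-1,\alpha}\left\{\frac{\kappa_6}{N^3} + \frac{4\kappa_4\kappa_2}{N^2(N - 1)} + \frac{4\kappa_3^2}{N^2(N - 1)}\right\}
- 4(N - 1)\chi^6_{N-1,\alpha}\frac{\kappa_5}{N^3} + \chi^8_{N-1,\alpha}\frac{\kappa_4}{N^3}.$$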


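When the parent population is Poisson
$$\kappa_r = \lambda \quad\text{for}\quad r = 1, 2, \ldots,$$
and the moments of S reduce to

$$E(S) = \lambda(N - 1 - \chi^2_{N-1,\alpha}),$$

$$\mu_2(S) = \lambda\left[\frac{(N - 1)^2}{N} - \frac{2(N - 1)}{N}\chi^2_{N-1,\alpha} + \frac{1}{N}\chi^4_{N-1,\alpha}\right] + 2\lambda^2(N - 1),$$

$$\mu_3(S) = \lambda\left[\frac{(N - 1)^3}{N^2} - \frac{3(N - 1)^2}{N^2}\chi^2_{N-1,\alpha} + \frac{3(N - 1)}{N^2}\chi^4_{N-1,\alpha} - \frac{1}{N^2}\chi^6_{N-1,\alpha}\right]$$
$$\qquad + \lambda^2\left[\frac{4(N - 1)(4N - 5)}{N} - \frac{12(N - 1)}{N}\chi^2_{N-1,\alpha}\right] + 8\lambda^3(N - 1),$$

$$\kappa_4(S) = \frac{\lambda}{N^3}\left[(N - 1)^4 - 4(N - 1)^3\chi^2_{N-1,\alpha} + 6(N - 1)^2\chi^4_{N-1,\alpha} - 4(N - 1)\chi^6_{N-1,\alpha} + \chi^8_{N-1,\alpha}\right]$$
$$\qquad + \frac{\lambda^2}{N^2}\left[8(N - 1)(11N^2 - 27N + 17) - 32(N - 1)(4N - 5)\chi^2_{N-1,\alpha} + 48(N - 1)\chi^4_{N-1,\alpha}\right]$$
$$\qquad + \frac{\lambda^3}{N}\left[48(N - 1)(5N - 7) - 96(N - 1)\chi^2_{N-1,\alpha}\right] + 48\lambda^4(N - 1).$$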
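
The rewriting of the test in terms of S can be illustrated numerically. The sketch below computes the Index of Dispersion and S for a small set of counts and evaluates the first two Poisson moments of S, using the factored form $\mu_2(S) = \lambda(N - 1 - \chi^2)^2/N + 2\lambda^2(N - 1)$; the function names and the counts are hypothetical, and the critical value 16.919 is the tabulated upper 5% point of $\chi^2$ with 9 degrees of freedom.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Index of Dispersion: sum of squared deviations divided by the mean."""
    xbar = mean(counts)
    return sum((x - xbar) ** 2 for x in counts) / xbar

def s_statistic(counts, chi2_crit):
    """S = (N - 1) k2 - k1 * chi-square point; S > 0 exactly when the index exceeds the point."""
    n = len(counts)
    return (n - 1) * variance(counts) - mean(counts) * chi2_crit

def poisson_moments_of_s(n, lam, chi2_crit):
    """E(S) and mu_2(S) when the parent population is Poisson with parameter lam."""
    e_s = lam * (n - 1 - chi2_crit)
    mu2 = lam * (n - 1 - chi2_crit) ** 2 / n + 2.0 * lam ** 2 * (n - 1)
    return e_s, mu2

counts = [3, 0, 2, 5, 1, 4, 2, 7, 0, 1]   # hypothetical counts, N = 10
chi2_crit = 16.919                         # upper 5% point, 9 degrees of freedom
index = dispersion_index(counts)
s_value = s_statistic(counts, chi2_crit)
e_s, mu2 = poisson_moments_of_s(len(counts), 2.0, chi2_crit)
```

Working with S rather than the index avoids the ratio altogether: S is a linear function of two k-statistics, which is why its moments can be written down exactly.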







3. Having obtained the moments it is necessary to decide on a suitable functional form with which
to approximate to the distribution. Preliminary calculations showed that the $\beta_1$ and $\beta_2$ of the distribution
of S lay in the Type IV area of the Pearson system. While it is natural to fit a Pearson curve, since so
many have occurred naturally in sampling theory, the Type IV is intractable to handle. Moreover recent
investigations have demonstrated the closeness of the probability integrals of the Type IV and the
Johnson $S_U$ curves with the same first four moments, and the latter system is considerably easier to
compute. In the main therefore either the Johnson $S_U$ or the log-normal curves were used to estimate
the probability integral from the moments. In two cases Pearson Type IV's were fitted as a check on
the Johnson system and in nine cases Edgeworth's form of the Gram-Charlier series. The difference
between the probability integrals estimated by all three curves is not large. The results are given in
Table 1. These estimated probabilities should be equal to a value of α = 0.05 if the $\chi^2$ distribution is
a satisfactory approximation to the true distribution of the Index of Dispersion. The interdependence


Table 1. Estimated probability integrals corresponding to a nominal $\chi^2$ level of 0.05

   λ               1                          5                     10
   N     S_U or S_L  Gr.-Ch.  Type IV   S_U or S_L  Gr.-Ch.   S_U or S_L  Gr.-Ch.
  201      0.051      0.052    0.052      0.050      0.050      0.050      0.050
  101      0.051      0.052    0.051      0.050      0.050        -          -
   51      0.051        -        -        0.050        -        0.050        -
   11      0.052        -        -        0.048        -        0.049        -
    6      0.055      0.050      -        0.048      0.053      0.048      0.052
































of N and λ will be noticed from the table. For either N large or λ large, or both large, as Hoel noticed,
the $\chi^2$ distribution is a reasonably good approximation, and in fact, there appears to be little risk of error
for N as small as 6 provided λ is greater than or equal to unity. Generally the larger the sample size the
smaller the acceptable value of λ, but this rule must not be pushed to extremes. I, in practice, is
discontinuous and may only take a finite set of values for given λ and N, a set which will decrease
rapidly with decreasing λ. In fact for λ small discontinuities are more likely to be a source of error
than assuming the distribution of I is $\chi^2$.

One further interesting point is suggested by these results. For any given sample size it appears that
for λ small the first kind of error is greater than that allowed for by the $\chi^2$ distribution. Thus in judging
significance from $\chi^2$ tables we shall tend to overestimate significance for λ small and a low nominal
significance level should therefore be chosen.


Table 2. Power of test when the alternative hypothesis is a compound Poisson.
(Nominal significance level = 0.05)

                              $\kappa_2/\kappa_1$ = 3/2               $\kappa_2/\kappa_1$ = 2
    N    Type of series    λ = 1     5      10         λ = 1     5      10
   51    Thomas            0.635   0.667   0.669       0.956   0.960     -
         Neyman            0.665   0.666   0.669         -       -       -
         Neg. binomial     0.606   0.659   0.666       0.898   0.949   0.955
  101    Thomas            0.870   0.888   0.890       0.999   1.000   0.999
         Neyman            0.865   0.887   0.890       0.999   0.999     -
         Neg. binomial     0.834   0.880   0.886       0.988   0.997   0.999


































4. The Index of Dispersion is commonly used to detect departures from the Poisson distribution in
the shape of the variance being larger than the mean. There are many population forms which can be
chosen to describe this state of affairs; we have confined ourselves to such distributions as can be
described by a two-parameter compound Poisson series. For the sake of example we considered three
types of such a distribution, the Thomas, the Neyman, and the negative binomial. The parameters of
these series can be chosen so that they all have the same mean and variance, but they will have different
higher cumulants. Substitution of the cumulants of any one of these in the general expression for the
moments of S enables us to obtain an approximate distribution from which we can find P{S > 0} and
hence the power of the index of dispersion test with regard to an alternative of the selected compound
Poisson type. The results of such calculations are given in Table 2.

The power of the test appears to depend but little on the value of λ (or $\kappa_1$) but increases, as is expected,
with increase in sample size and with the value of $\kappa_2/\kappa_1$. The differences in the power of the test with
regard to the three alternative series are not large.


REFERENCES 


Cochran, W. G. (1936). Ann. Eugen., Lond., 7, 207.
Hoel, P. G. (1943). Ann. Math. Statist. 14, 155.
Lancaster, H. O. (1952). Biometrika, 39, 419.


On an extension of Geary’s theorem 
By R. G. LAHA, Statistical Laboratory, Calcutta 


R. C. Geary (1936) has shown that if the sample mean is distributed independently of the sample variance,
then the population should be normal. He assumed that the moments of all orders in the population
should exist. Later on, E. Lukacs (1942) proved the same theorem, assuming only that the population
should possess a finite variance.

Basu & Laha (1952) extended Geary’s theorem by proving that if the sample mean is distributed
independently of any sample k-statistic (as defined by R. A. Fisher), say $k_r$ ($r > 2$), then the population
is normal.

Here another extension of Geary’s theorem is given.


THEOREM. Let $x_1, x_2, \ldots, x_n$ be identically distributed, independent random variables with a finite
variance $\sigma^2$. If now the conditional expectation of any unbiased quadratic estimate of $c\sigma^2$ ($c \neq 0$) for given
$x_1 + x_2 + \ldots + x_n$ does not involve the latter, then each $x$ should be normally distributed.


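Proof. Let $Q = \sum_{i,j} a_{ij} x_i x_j$ be any unbiased quadratic estimate of $c\sigma^2$ ($c \neq 0$).
Then, from the definition of unbiasedness, we have
$$E(Q) = c\sigma^2,$$
or
$$\sigma^2 \sum_{i=1}^{n} a_{ii} + m^2 \sum_{i,j=1}^{n} a_{ij} = c\sigma^2, \quad \text{where } m = E(x),$$
or
$$\sum_{i=1}^{n} a_{ii} = c \quad \text{and} \quad \sum_{i,j=1}^{n} a_{ij} = 0. \tag{1.1}$$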
i=1 i,j=1 


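Now, from the condition of the theorem, it follows evidently that
$$E\{Q\, e^{it\Sigma x}\} = E(Q)\, E(e^{it\Sigma x}) = c\sigma^2 \{\phi(t)\}^n, \tag{1.2}$$
where $\phi(t) = E(e^{itx})$ stands for the characteristic function of the distribution of $x$.
Now simplifying the left-hand side of (1.2), it can be shown easily that (1.2) reduces to the form
$$\left(\frac{\phi''}{\phi}\right) \sum_{i=1}^{n} a_{ii} + \left(\frac{\phi'}{\phi}\right)^2 \sum_{i \neq j} a_{ij} = -c\sigma^2, \tag{1.3}$$
where primes denote differentiation with respect to $t$.
Let us now write $\psi(t) = \log \phi(t)$, where $\psi(t)$ represents the cumulant-generating function of the
distribution of $x$; then
$$\frac{d\psi}{dt} = \frac{\phi'}{\phi} \quad \text{and} \quad \frac{d^2\psi}{dt^2} = \frac{\phi''}{\phi} - \left(\frac{\phi'}{\phi}\right)^2. \tag{1.4}$$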





Substituting (1.4) in (1.3) we have
$$\frac{d^2\psi}{dt^2} \sum_{i=1}^{n} a_{ii} + \left(\frac{d\psi}{dt}\right)^2 \sum_{i,j=1}^{n} a_{ij} = -c\sigma^2. \tag{1.5}$$
Now, from the relations in (1.1), (1.5) at once gives
$$\frac{d^2\psi}{dt^2} = -\sigma^2, \tag{1.6}$$
that is, $\psi(t)$ is a polynomial of degree 2, which shows evidently that $x$ should be normally distributed.
It is interesting to note here that we need not assume the complete stochastic independence of $\Sigma x$ and $Q$.
From this theorem and Craig’s theorem (1943), it follows evidently that if the sum $x_1 + x_2 + \ldots + x_n$ is
distributed independently of any quadratic form $\sum_{i,j=1}^{n} a_{ij} x_i x_j$ satisfying the relations
$$\text{(i)} \ \sum_{i=1}^{n} a_{ii} \neq 0 \quad \text{and} \quad \text{(ii)} \ \sum_{i,j=1}^{n} a_{ij} = 0,$$
then $\epsilon A = 0$, where $A$ is the matrix of the quadratic form, and $\epsilon = (1, 1, \ldots, 1)$.
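
As an illustration of the two conditions on the coefficients, the following sketch checks them for the familiar quadratic form of the unbiased sample variance, for which c = 1. The choice of this particular form and the helper names are assumptions made for the example; they are not taken from the note.

```python
def sample_variance_matrix(n):
    """Matrix A such that x'Ax is the unbiased sample variance of n values."""
    return [
        [((1.0 if i == j else 0.0) - 1.0 / n) / (n - 1) for j in range(n)]
        for i in range(n)
    ]


def trace(a):
    """Sum of the diagonal coefficients a_ii."""
    return sum(a[i][i] for i in range(len(a)))


def grand_total(a):
    """Sum of all coefficients a_ij."""
    return sum(sum(row) for row in a)


def epsilon_times(a):
    """Row vector (1, ..., 1) multiplied by A, i.e. the column sums of A."""
    size = len(a)
    return [sum(a[i][j] for i in range(size)) for j in range(size)]


if __name__ == "__main__":
    A = sample_variance_matrix(5)
    print(trace(A))          # close to 1: the diagonal sum equals c
    print(grand_total(A))    # close to 0: the sum of all coefficients vanishes
    print(epsilon_times(A))  # every column sum is close to 0
```

Here the diagonal sum is non-zero, the sum of all coefficients vanishes, and every column sum of A is zero, which is the relation εA = 0 for this form.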


REFERENCES 
Basu, D. & Laha, R. G. (1952). On some characterisations of the normal distribution. Sankhya
(in the Press).
Craig, A. T. (1943). Note on the independence of certain quadratic forms. Ann. Math. Statist. 14,
195-7.
Geary, R. C. (1936). The distribution of the Student’s ratio for the non-normal samples. Suppl. J. R.
Statist. Soc. 3, 178-84.
Lukacs, E. (1942). A characterisation of the normal distribution. Ann. Math. Statist. 13, 91-3.


The Doolittle method and the fitting of polynomials to weighted data 
By P. G. GUEST, University of Sydney, Australia 


A modification of the standard Doolittle scheme of solving normal equations to include the calculation 
of orthogonal polynomials and standard errors has recently been described (Guest, 1950). In the present 
communication the case in which the observations are weighted and factorial moments are employed 
is considered. This has been discussed by Aitken, Fisher & Kimball, and it is the purpose of this note 
to correlate and extend their treatments. 

The least-squares curve $u_p(x)$ of degree $p$ which fits the $n$ observations $y(x_i)$, of weight $w(x_i)$, is obtained
from the condition that
$$\sum_{i=1}^{n} w(x_i)\{y(x_i) - u_p(x_i)\}^2$$
should be a minimum. We consider the case in which it is desired to express $u_p(x)$ as a linear function of
polynomials $f_j(x)$ of degree $j$ in $x$, $j$ taking values from 0 to $p$. Then
$$u_p(x) = \sum_{j=0}^{p} b_{pj} f_j(x), \tag{1}$$
where the $b_{pj}$ are given by the normal equations
$$\sum_{i} w(x_i)\Big\{y(x_i) - \sum_{j=0}^{p} b_{pj} f_j(x_i)\Big\} f_k(x_i) = 0 \quad (k = 0 \text{ to } p). \tag{2}$$
If we write
$$\sum_{i} w(x_i)\, y(x_i)\, f_k(x_i) = M_k, \tag{3}$$
$$\sum_{i} w(x_i)\, f_j(x_i)\, f_k(x_i) = \phi_{jk}, \tag{4}$$
then the normal equations become
$$\sum_{j=0}^{p} b_{pj}\, \phi_{jk} = M_k. \tag{5}$$











The quantities $\phi_{jk}$ form a symmetric matrix, and the standard Doolittle or abbreviated Doolittle method
of solution (Dwyer, 1941) may be employed.
We can introduce orthogonal polynomials $T_j(x)$, of degree $j$ in $x$, satisfying the equations
$$\sum_{i} w(x_i)\, T_j(x_i)\, T_k(x_i) = 0 \quad (j \neq k). \tag{6}$$
These equations determine the orthogonal polynomials except for an arbitrary factor. This factor is
conveniently chosen so that the leading coefficients of $T_j(x)$ and $f_j(x)$ are identical. Thus
$$T_j(x) = f_j(x) + \sum_{k=0}^{j-1} \alpha_{jk} T_k(x), \tag{7}$$
where, using (6),
$$\alpha_{jk} = -\sum_{i} w(x_i)\, f_j(x_i)\, T_k(x_i) \Big/ \sum_{i} w(x_i)\, T_k^2(x_i). \tag{8}$$
Writing
$$S_{jk} = \sum_{i} w(x_i)\, f_j(x_i)\, T_k(x_i), \tag{9}$$
we have
$$\alpha_{jk} = -S_{jk}/S_{kk} \tag{10}$$
and
$$S_{jk} = \sum_{i} w(x_i)\, f_j(x_i)\{f_k(x_i) + \alpha_{k,k-1} T_{k-1}(x_i) + \ldots + \alpha_{k0} T_0(x_i)\} = \phi_{jk} + \sum_{m=0}^{k-1} \alpha_{km} S_{jm}. \tag{11}$$
The quantities $S_{jk}$, $\alpha_{jk}$ can thus be built up in succession from the given $\phi_{jk}$ and the previously calculated
values $S_{jm}$, $\alpha_{km}$ ($m < k$).
If $u_p(x)$ is expanded in terms of the polynomials $T_j(x)$,
$$u_p(x) = \sum_{j=0}^{p} a_j T_j(x), \tag{12}$$
then the least-squares condition leads to the equations
$$a_j = M'_j / S_{jj}, \tag{13}$$
where
$$M'_j = \sum_{i} w(x_i)\, y(x_i)\, T_j(x_i) = M_j + \sum_{k=0}^{j-1} \alpha_{jk} M'_k. \tag{14}$$


The author has pointed out that the quantities $S_{jk}$, $\alpha_{jk}$, $M'_j$, $a_j$ are obtained directly in the forward
section of the standard Doolittle method of solving the normal equations. For example, in Dwyer’s
notation, $S_{jk} = \phi_{jk\cdot 12\ldots(k-1)}$ and $M'_j = M_{j\cdot 12\ldots(j-1)}$. Further, if coefficients $\beta_{jk}$ are introduced such that
$$T_j(x) = \sum_{k=0}^{j} \beta_{jk} f_k(x), \tag{15}$$
then
$$\beta_{jk} = \alpha_{jk} + \sum_{m=k+1}^{j-1} \alpha_{jm} \beta_{mk}, \tag{16}$$
and the $\beta_{jk}$ can be built up in the Doolittle table along with the $\alpha_{jk}$.
The coefficients $b_{pj}$ can be written as
$$b_{pj} = a_j + \sum_{k=j+1}^{p} \alpha_{kj} b_{pk} \tag{17}$$
or
$$b_{pj} = a_j + \sum_{k=j+1}^{p} \beta_{kj} a_k. \tag{18}$$
Equation (17) is used in the backwards section of the standard Doolittle solution. However, the expansion
in terms of the $\beta_{kj}$ has the advantage that the coefficients $b_{rj}$ in the least-squares polynomials of
degree $r$ ($< p$) are obtained by merely omitting the terms $\beta_{kj} a_k$ in (18) for which $k > r$. The use of the
quantities $\beta$ in the estimation of standard errors has been described in the earlier paper.
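
The recurrences (10), (11), (13), (14), (16) and (18) can be sketched in a few lines for the power basis f_j(x) = x^j. This is only an illustrative sketch under that assumption; the function names are hypothetical and the layout does not reproduce the Doolittle table of the note.

```python
def orthogonal_fit(x, y, w, p):
    """Forward pass for the basis f_j(x) = x**j.

    Returns (a, beta, phi, M): a[j] multiplies the orthogonal polynomial
    T_j, and beta[j][k] is the coefficient of x**k in T_j.
    """
    size = p + 1
    # Moments (3) and (4) for the power basis.
    phi = [[sum(wi * xi ** (j + k) for wi, xi in zip(w, x)) for k in range(size)]
           for j in range(size)]
    M = [sum(wi * yi * xi ** k for wi, yi, xi in zip(w, y, x)) for k in range(size)]

    S = [[0.0] * size for _ in range(size)]
    alpha = [[0.0] * size for _ in range(size)]
    for k in range(size):
        for j in range(k, size):
            # (11): S_jk = phi_jk + sum over m < k of alpha_km * S_jm
            S[j][k] = phi[j][k] + sum(alpha[k][m] * S[j][m] for m in range(k))
        for j in range(k + 1, size):
            # (10): alpha_jk = -S_jk / S_kk
            alpha[j][k] = -S[j][k] / S[k][k]

    m_prime = [0.0] * size
    a = [0.0] * size
    beta = [[0.0] * size for _ in range(size)]
    for j in range(size):
        # (14) followed by (13)
        m_prime[j] = M[j] + sum(alpha[j][k] * m_prime[k] for k in range(j))
        a[j] = m_prime[j] / S[j][j]
        # (16), with beta_jj = 1 since T_j and f_j share a leading coefficient
        beta[j][j] = 1.0
        for k in range(j):
            beta[j][k] = alpha[j][k] + sum(
                alpha[j][m] * beta[m][k] for m in range(k + 1, j))
    return a, beta, phi, M


def power_coefficients(a, beta, r):
    """(18) truncated at degree r: b_rj = a_j + sum_{k=j+1..r} beta_kj * a_k."""
    return [a[j] + sum(beta[k][j] * a[k] for k in range(j + 1, r + 1))
            for j in range(r + 1)]


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5]
    ys = [1.0, 2.7, 5.8, 11.1, 17.9, 27.2]
    ws = [1, 2, 1, 3, 1, 2]
    a, beta, phi, M = orthogonal_fit(xs, ys, ws, 3)
    for r in (1, 2, 3):
        print(r, power_coefficients(a, beta, r))
```

A single forward pass to degree 3 yields the coefficients of the fitted polynomials of degree 1, 2 and 3 by truncating (18), which is the advantage claimed for the expansion in terms of the β.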


The polynomials $f_j(x)$ are commonly $x^j$ or $\binom{x}{j}$. Whether power moments or reduced factorial moments
are used depends both on the type of machine available and on the form in which the least-squares curve
is required. If power moments are used, then $\phi_{jk} = \sum w(x)\, x^{j+k}$. Power moments are always used when
the data do not occur at equally spaced intervals.

When reduced factorial moments are used, $\phi_{jk} = \sum w(x) \binom{x}{j}\binom{x}{k}$, which is not directly calculable by
repeated summation. However, $\phi_{jk}$ can be expanded as a linear function of the $\sum w(x) \binom{x}{m}$. In fact
$$\phi_{jk} = \sum_{m} \binom{k}{m}\binom{j+k-m}{k} \sum w(x) \binom{x}{j+k-m}. \tag{19}$$
In the method of calculation described by Fisher, equation (5) is used, with the $\phi_{jk}$ calculated from
equation (19).
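
The expansion of a product of two binomial coefficients in x as a linear function of single binomial coefficients can be checked numerically. The sketch below assumes the identity in the form C(x, j) C(x, k) = Σ_m C(k, m) C(j+k−m, k) C(x, j+k−m), with m running from 0 to min(j, k); the function names are hypothetical.

```python
from math import comb


def product_expansion(j, k):
    """Coefficients c[r] with C(x, j) * C(x, k) = sum over r of c[r] * C(x, r).

    The index r stands for j + k - m, with m from 0 to min(j, k).
    """
    return {j + k - m: comb(k, m) * comb(j + k - m, k)
            for m in range(min(j, k) + 1)}


def identity_holds(j, k, xs=range(12)):
    """Compare both sides of the identity at the integer points in xs."""
    coeffs = product_expansion(j, k)
    return all(
        comb(x, j) * comb(x, k) == sum(c * comb(x, r) for r, c in coeffs.items())
        for x in xs
    )


if __name__ == "__main__":
    # x * x = 2 * C(x, 2) + C(x, 1)
    print(product_expansion(1, 1))
    print(all(identity_holds(j, k) for j in range(5) for k in range(5)))
```

Both sides are polynomials in x of degree j + k, so agreement at more than j + k points confirms the identity for the pairs tested.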

Aitken, however, uses for the functions $f_j(x)$ the unreduced factorials $x^{(j)} = x(x-1)\ldots(x-j+1)$,
and by linear combination converts the equations (5) into the equivalent set
$$\sum_{j=0}^{p} b_{pj} \sum_{i} w(x_i)\,(x_i + k)^{(j+k)} = \sum_{i} w(x_i)\, y(x_i)\,(x_i + k)^{(k)}. \tag{20}$$
The sums and moments occurring in these equations can be obtained directly by repeated summation.
But the matrix of the coefficients of $b_{pj}$ is now non-symmetric, and the Doolittle method of solution
cannot be used. For this reason Fisher’s method of obtaining the normal equations is to be preferred,
and these equations should be solved by the Doolittle technique.


REFERENCES 


Aitken, A. C. (1933). Proc. Roy. Soc. Edinb. 54, 1.
Dwyer, P. S. (1941). Psychometrika, 6, 101.
Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed., p. 166. Edinburgh: Oliver
and Boyd.
Guest, P. G. (1950). Phil. Mag. [7], 41, 124.
Kimball, B. F. (1940). Ann. Math. Statist. 11, 348.


A simple method of deriving best critical regions similar to the sample space in 
tests of an important class of composite hypotheses 


By K. S. RAO,* Department of Statistics, University of Bombay 


1. Introduction. The theory of testing of hypotheses, developed by Neyman & Pearson in a series of 
articles commencing from 1928, is an integral part of the syllabus in mathematical statistics at various 
universities. In teaching the subject, the author of the present paper found that it was possible to 
shorten the rather formal step which Neyman & Pearson (1933) took in their paper ‘On the problem of 
the most efficient tests of statistical hypotheses’, in finding the test statistic to be used for a uniformly 
most powerful test of a composite hypothesis regarding a normal universe. By using a simple lemma 
given below, a great deal of formal algebra of the original paper can be saved. 


2. The lemma. The necessary and sufficient condition that the variates $X$ and $Y$ may be independently
distributed as $\chi^2\sigma^2$-variates, with $m$ and $n$ as their respective degrees of freedom, is that the variates
$U$ and $V$ defined by
$$U = \frac{nX}{mY}, \quad V = X + Y, \tag{1}$$
are independently distributed, respectively, as an $F$-variate with $m$ and $n$ degrees of freedom and a
$\chi^2\sigma^2$-variate with $m+n$ degrees of freedom. In other words, the sum of two independent $\chi^2\sigma^2$-variates
is independent of their ratio.
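
A small simulation gives a rough numerical feel for the lemma. It is a sketch only: the degrees of freedom, the value of σ, the seed and the function names are arbitrary assumptions, and a near-zero sample correlation is a necessary consequence of independence, not a proof of it.

```python
import random
from math import sqrt


def chi_square_sigma2(df, sigma, rng):
    """Sum of df squared normal deviates with standard deviation sigma."""
    return sum(rng.gauss(0.0, sigma) ** 2 for _ in range(df))


def simulate(m, n, sigma=2.0, trials=20000, seed=1):
    """Draw (U, V) pairs with U = nX/(mY) and V = X + Y."""
    rng = random.Random(seed)
    us, vs = [], []
    for _ in range(trials):
        x = chi_square_sigma2(m, sigma, rng)
        y = chi_square_sigma2(n, sigma, rng)
        us.append(n * x / (m * y))
        vs.append(x + y)
    return us, vs


def correlation(a, b):
    """Ordinary product-moment correlation of two equal-length lists."""
    size = len(a)
    mean_a, mean_b = sum(a) / size, sum(b) / size
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    us, vs = simulate(4, 12)
    print("correlation of U and V:", correlation(us, vs))
    print("mean of U (F mean is n/(n-2) = 1.2):", sum(us) / len(us))
    print("mean of V ((m+n) sigma^2 = 64):", sum(vs) / len(vs))
```

The sample means are printed only as a sanity check that U behaves like an F-variate and V like a χ²σ²-variate with m + n degrees of freedom.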

The proof† is easy and therefore omitted. An extension of this lemma is that, if $X_1, \ldots, X_r$ are independent
$\chi^2\sigma^2$-variates with $m_1, \ldots, m_r$ degrees of freedom, then the $r$ statistics
$$\frac{m_1 X_2}{m_2 X_1}, \quad \frac{m_1 + m_2}{m_3}\,\frac{X_3}{X_1 + X_2}, \quad \ldots, \quad \frac{\sum_{s=1}^{r-1} m_s}{m_r}\,\frac{X_r}{\sum_{s=1}^{r-1} X_s}, \quad \sum_{s=1}^{r} X_s \tag{2}$$
are, respectively, $r-1$ independent $F$-variates and a $\chi^2\sigma^2$-variate, and conversely. Multivariate
extensions of the above result and their applications are given by K. S. Rao (1951).

* Nuffield Foundation Dominion Fellow, 1952-3.
† It will be found in a number of text-books on statistics.










3. Illustration of the method. The extent of simplification that results by using the above lemma is 
illustrated by applying it to Example 10 of Neyman & Pearson (1933, pp. 328-32). The method is general 
and applicable to similar situations. In what follows, the numbers given in bold type after the serial 
number of an equation refer to the number of the equation in Neyman & Pearson’s paper. This 
procedure, it is hoped, will bring out clearly the simplification made. 

The test is that for the significance of the difference between two variances. Neyman & Pearson
consider that two samples,

(1) $\Sigma_1$ of size $n_1$, mean $= \bar{x}_1$, standard deviation $= s_1$,

(2) $\Sigma_2$ of size $n_2$, mean $= \bar{x}_2$, standard deviation $= s_2$,

have been drawn at random from some normal populations. If this is so, the most general probability
law for the observed event may be written
$$p(x_{11}, \ldots, x_{1n_1};\ x_{21}, \ldots, x_{2n_2}) = \left(\frac{1}{\sqrt{2\pi}}\right)^{N} \frac{1}{\sigma_1^{n_1}\sigma_2^{n_2}} \exp\left[-n_1\frac{(\bar{x}_1 - a_1)^2 + s_1^2}{2\sigma_1^2} - n_2\frac{(\bar{x}_2 - a_2)^2 + s_2^2}{2\sigma_2^2}\right], \tag{3, \mathbf{159}}$$
where $n_1 + n_2 = N$, and $a_1$, $\sigma_1$ are the mean and standard deviation of the first, and $a_2$, $\sigma_2$ of the second
sampled population.

The admissible simple hypotheses include pairs of sampled populations for which $a_1$, $a_2$, $\sigma_1 > 0$, $\sigma_2 > 0$
may have any values whatever. $H_0$ is the composite hypothesis that $\sigma_1 = \sigma_2$. This is a test for significance
of the difference between two variances in two independent samples. The parameters may be
defined as follows:

For a simple alternative $H_t$:
$$\alpha^{(1)} = a_1, \quad \alpha^{(2)} = a_2 - a_1 = b, \quad \alpha^{(3)} = \sigma_1, \quad \alpha^{(4)} = \theta = \sigma_2/\sigma_1. \tag{4, \mathbf{160}}$$
For the hypothesis to be tested, $H_0$:
$$\alpha^{(1)} = a, \quad \alpha^{(2)} = b, \quad \alpha^{(3)} = \sigma, \quad \alpha^{(4)} = 1. \tag{5, \mathbf{161}}$$
After showing that the conditions required by their theory are satisfied, they proceed to construct the
best critical region similar to the sample space.
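In the process they obtain the surfaces
$$\phi_1 = \bar{x}_1 = \text{const.}, \tag{6, \mathbf{171}}$$
$$\phi_2 = \bar{x}_2 = \text{const.}, \tag{7, \mathbf{172}}$$
$$\phi_3 = \frac{1}{N}(n_1 s_1^2 + n_2 s_2^2) = s_a^2 = \text{const.} \tag{8, \mathbf{173}}$$
The element $w_0(\phi_1, \phi_2, \phi_3)$ is the part of $W(\phi_1, \phi_2, \phi_3)$ within which
$$p_t > k(\bar{x}_1, \bar{x}_2, s_a)\, p_0, \tag{9, \mathbf{174}}$$
the value of $k$ being determined for each system of values of $\bar{x}_1$, $\bar{x}_2$, $s_a$, so that
$$P_0(w_0(\phi_1, \phi_2, \phi_3)) = \epsilon P_0(W(\phi_1, \phi_2, \phi_3)). \tag{10, \mathbf{175}}$$
The best critical region similar to the sample space is obtained by integrating the element $w_0(\phi_1, \phi_2, \phi_3)$
with respect to $\phi_1$, $\phi_2$, $\phi_3$ over their ranges of possible values. By using condition (174) they derive
that, for

(a) $\theta = \sigma_2/\sigma_1 > 1$: the ‘best critical region’ (B.C.R.) will be defined by
$$s_2^2 > k_1(\bar{x}_1, \bar{x}_2, s_a^2), \tag{11, \mathbf{178}}$$
and for

(b) $\theta = \sigma_2/\sigma_1 < 1$: the B.C.R. will be defined by
$$s_2^2 < k_2(\bar{x}_1, \bar{x}_2, s_a^2). \tag{12, \mathbf{179}}$$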


After devoting the whole of p. 331 and part of p. 332 to formal algebra, they obtain the best critical
regions as follows:

(a) For alternatives $\sigma_2 > \sigma_1$:
$$u = \frac{n_2 s_2^2}{n_1 s_1^2 + n_2 s_2^2} > u_0. \tag{13, \mathbf{194}}$$
(b) For alternatives $\sigma_2 < \sigma_1$:
$$u = \frac{n_2 s_2^2}{n_1 s_1^2 + n_2 s_2^2} < u_0'. \tag{14, \mathbf{195}}$$
However, the lemma quoted above makes the transition from (178) to (194) very simple, as follows.
Dividing both sides of (178) by the positive quantity $(n_1 s_1^2 + n_2 s_2^2)/n_2 = N s_a^2/n_2$, in case (a), the
B.C.R. will be defined by
$$u = \frac{n_2 s_2^2}{n_1 s_1^2 + n_2 s_2^2} > k'(\bar{x}_1, \bar{x}_2, s_a^2). \tag{15}$$
The left-hand side of the inequality is a function of the ratio of two independent $\chi^2\sigma^2$-variates, $n_1 s_1^2$
and $n_2 s_2^2$, which are independent of $\bar{x}_1$ and $\bar{x}_2$. From the lemma it follows that the left-hand side is
statistically independent of $\bar{x}_1$, $\bar{x}_2$ and $s_a^2$. Hence any percentage point of the statistic $u$ on an intersection surface
given by (171), (172) and (173) is the same for all such surfaces, and hence the B.C.R. is given by (194).


REFERENCES 
Neyman, J. & Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses.
Phil. Trans. A, 231, 289-337.
Rao, K. S. (1951). On the mutual independence of a set of Hotelling’s T²’s derivable from a sample
of size n from a k-variate normal population. Proc. Int. Statist. Conf., India,











[ 234 ] 


REVIEWS 


Econometrics. By GERHARD TINTNER. New York: John Wiley and Sons. 1952. $5.75. 


Quantitative economics has three main branches. First there is the expression of the laws of economic 
behaviour in a form which permits of empirical application. Secondly, there is the problem, often of 
great practical difficulty, of extracting from the observed workings of the economic system the data 
by which the magnitudes entering into these laws may be estimated. Finally, the estimated magnitudes 
must be applied to the theoretical laws in order to calculate parameters and test hypotheses concerning 
them. Strictly speaking, all three branches have their place in econometrics, but the usage of the term 
has tended to become restricted to the last of the three; it is with this branch of the subject that 
Dr Tintner’s book is wholly concerned. 

Earlier books on econometrics have dealt mainly with the economic background or with special 
parts of the field, and this is the first attempt that has been made at a systematic exposition of the 
statistical methodology of the subject. The book is a compendium of known procedures rather than 
a research monograph, and thus will be of great value to the practical worker although containing 
little that will be new to the mathematical statistician. 

Part I gives a non-technical introduction and has the great merit of containing a large number of
numerical illustrations of the application of the methods to a wide variety of economic problems. This
part of the book, with its lavish supply of references, furnishes us incidentally with a valuable historical
survey. One confusing feature is the treatment of linear regression theory, which is introduced in terms
of the linear relation $Y = \alpha + \beta X$. Presumably one should regard this as a functional relation which
holds exactly. However, we have next ‘the empirical regression equation’ $Y' = a + bX$ (1) which ‘will
not hold exactly’. But we are not told what $a$ and $b$ are and why they differ from $\alpha$ and $\beta$. They
cannot be the sample estimators, since these are denoted later by $a^*$ and $b^*$ and the residuals $Y - Y'$
are assumed to be independent; on the other hand, they must be random variables since the residuals
are random variables. It would surely have been much clearer to have written down the model
explicitly as $Y = \alpha + \beta X + \epsilon$ in the first place, any relevant assumptions being made about $\epsilon$. Once this
hurdle is over, the exposition is sound and straightforward.

Part II, dealing with multivariate analysis, takes the development into rather deeper levels of 
statistical methodology, and it is here that one begins to wonder how far the data can support the 
rather elaborate analyses that are made of them. Throughout the book the author is at some pains 
to point out the defects of his data, but this cannot be done in detail, and the mere fact that such 
analyses are made may beguile the uncritical. Much use is made here of models in which the relations 
are assumed to hold exactly between variables the observations of which are subject to error. Apart 
from obvious theoretical objections, the use of such models requires a knowledge of the variance matrix 
of the errors, and this will not be available in practice. The way round this difficulty suggested by the 
author, namely, to assume that the systematic parts of the variables have smooth time paths and 
to extract the error components by the variate difference method, does not seem very convincing. 

Part III gives a useful survey of the special problems that arise owing to the fact that most economic
data occur in the form of time series. Particularly helpful is the stress laid on the analysis of regression
data with serially correlated errors.

J. DURBIN


Quality Control and Industrial Statistics. By ACHESON J. DUNCAN. Chicago:
Richard D. Irwin, Inc. 1952. xxvii+663 pp. Price $9.


The book begins with a historical review of the industrial applications of statistics in U.S. and Great 
Britain and with some quotations from the Wall Street Journal and the New York Times to testify to 
the commercial value of the methods. Part I of the book then deals with the fundamental notions of 
probability, frequency distribution and sampling distribution. Part II is about single sample, double 
sample and sequential acceptance schemes for proportion defective, and Part III is on single sample 
schemes for continuous variables. Part IV describes the theory of control charts, and Part V deals with 
statistical methods likely to be useful in industrial research, including contingency tables, analysis of 
variance, regression and correlation and experimental design. Proofs that can be obtained by elementary 
algebra are contained in appendices, and there is a comprehensive set of statistical tables and glossaries 







of symbols and special technical terms. There are over 300 exercises for the reader, many of them 
containing sets of data from the American literature on quality control. The book as a whole is copiously 
illustrated with worked examples, tables, diagrams and nomograms. 

Part II on acceptance sampling for proportion defective suffers from the failure to emphasize the 
fundamental distinction between rectifying and non-rectifying inspection and from not stressing the 
role of cost in determining the amount and type of sampling. The result is a clear account of the main 
types of scheme but, apart from one short section, no indication is given of which type should be used 
in given circumstances. A whole chapter in this Part is devoted to a detailed description of the sampling 
schemes used by the U.S. Department of Defence. 

Part IV on control charts follows the usual American practice of setting control limits three standard 
deviations from the mean even when sampling binomial and Poisson distributions. An unusual feature 
of the treatment is that the operating characteristics of the charts are always given. The account of 
‘compressed-limit gauges’ (p. 296) is rather misleading, the important paper of W. L. Stevens (J. R. 
Statist. Soc. B, 10 (1948), 54) having been overlooked. The use of control charts to help to improve 
processes so that they become ‘in control’ is clearly explained but the inexperienced reader might get 
the erroneous idea that the state of ‘control’ is easily attained. 

Part V includes, besides the conventional statistical methods, some recent developments not normally 
appearing in text-books. Examples are the use of short-cut methods of analysis of variance based on 
range, J. W. Tukey’s gap and straggler tests for comparing individual means in analysis of variance, and 
some interesting work by F. E. Grubbs on comparing the precision of different observers or instruments 
from results arranged in a two-way table. 

A surprising omission is that there is no adequate account of how to obtain samples. There is a brief 
discussion of the general principles of sampling, but the example given is highly simplified. Some 
description of special methods of sampling powders, liquids, fibres, etc., would have been very valuable. 

According to the author’s preface, the book is intended both as a text-book for engineering and 
business students and as a reference book for quality control engineers and industrial research workers. 
Engineering and business students in this country, who receive at most a short course on statistics, are 
unlikely to find this very detailed and lengthy account of use. The excellent collection of exercises will, 
however, be very valuable to anyone planning a course of lectures on statistics for engineers, and the 
book can be strongly recommended on this account. Because it is reasonably comprehensive, the book 
can also be recommended to technologists using quality control and acceptance sampling. 

D. R. COX 















[ 236 ] 


CORRIGENDA 


(1) TABLES OF PERCENTAGE POINTS OF THE ‘STUDENTIZED’ RANGE 


(Biometrika (1952), 39, 192) 


I very much regret to have to report inaccuracies in the lines $\nu = 5$ and 7 of the revised tables of this
function. These lines were obtained, at the last stage of the computational procedure, by interpolation
between neighbouring values of $\nu$. Through an oversight on my part the convergence of the interpolation
formula was not checked, nor were the four percentage points of $q$ for $n = 2$, $\nu = 5, 7$ compared
with the corresponding percentage points of $\sqrt{2}\,t$.*


Table of the upper percentage points of the ‘Studentized’ range, q

          5% points          1% points
  n     ν = 5    ν = 7     ν = 5    ν = 7

  2     3.64     3.34      5.70     4.95
  3     4.60     4.16      6.97     5.92
  4     5.22     4.68      7.80     6.54
  5     5.67     5.06      8.42     7.01
  6     6.03     5.36      8.91     7.37
  7     6.33     5.61      9.32     7.68
  8     6.58     5.82      9.67     7.94
  9     6.80     6.00      9.97     8.17
 10     6.99     6.16     10.24     8.37
 11     7.17     6.30     10.48     8.55
 12     7.32     6.43     10.70     8.71
 13     7.47     6.55     10.89     8.86
 14     7.60     6.66     11.08     9.00
 15     7.72     6.76     11.24     9.12
 16     7.83     6.85     11.40     9.24
 17     7.93     6.94     11.55     9.35
 18     8.03     7.02     11.68     9.46
 19     8.12     7.09     11.81     9.55
 20     8.21     7.17     11.93     9.65






































The realization of this omission made it necessary to develop a completely different interpolation
formula, and it was noticed that the ratio $q$ is almost proportional to $\sqrt{F(n-1, \nu)}$, where $F(n-1, \nu)$
is the corresponding percentage point of the variance ratio $F$ based on (i) $n-1$ degrees of freedom for
the numerator mean square, where $n$ is the size of the sample with range $w$ in $q = w/s$, and (ii) $\nu$ degrees
of freedom for the denominator mean square, which are also those of the independent value of $s$ in
$q = w/s$.

In fact, for any given $n$, the ratio $F(n-1, \nu)/q^2$ is almost constant for all values of $\nu$ and could be
easily obtained by interpolation for the intermediate values of $\nu = 5$ and 7 and converted to $q$ with the
help of the published tables of $F(n-1, \nu)$ (Biometrika (1943), 33, 73). Further, this property made it
possible to apply an additional check to all values of $q$ in the tables. Only for the lines $\nu = 5, 7$ were
discrepancies found exceeding 4 units in the last figure, and the corrected values for these lines only
are therefore given above. Small end-figure adjustments of other values will, however, be made when
the full corrected table is published as part of the forthcoming Biometrika Tables for Statisticians.
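
The check against Student's t for n = 2 is easy to repeat: the range of two observations is |x₁ − x₂|, so q = √2 |t|, and the upper 5% and 1% points of q correspond to the two-sided 5% and 1% points of t. The sketch below hard-codes those t values from standard tables (an assumption of the example, not figures quoted in the corrigendum) and compares them with the n = 2 line of the corrected table.

```python
from math import sqrt

# Two-sided percentage points of Student's t, taken from standard tables
# (assumed here; they are not given in the corrigendum itself).
T_POINTS = {
    (5, "5%"): 2.5706,
    (7, "5%"): 2.3646,
    (5, "1%"): 4.0321,
    (7, "1%"): 3.4995,
}

# n = 2 entries of the corrected table of q, keyed by (nu, level).
Q_N2 = {
    (5, "5%"): 3.64,
    (7, "5%"): 3.34,
    (5, "1%"): 5.70,
    (7, "1%"): 4.95,
}


def discrepancies():
    """Tabulated q minus sqrt(2) * t for each (nu, level) pair."""
    return {key: Q_N2[key] - sqrt(2.0) * t for key, t in T_POINTS.items()}


if __name__ == "__main__":
    for (nu, level), diff in sorted(discrepancies().items()):
        print(f"nu = {nu}, {level} point: q - sqrt(2) t = {diff:+.4f}")
```

Each difference is below half a unit in the second decimal place, so the corrected n = 2 entries agree with √2 t to the accuracy of the table.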

H. O. HARTLEY 


(2) J. Draper, Biometrika (1952), 39, 299. 


In the last column of the table half-way down the page, for 2-753 read 2-750. 


* We are indebted to Mr George W. Thomson of the Ethyl Corporation, Michigan, for drawing our
attention to the discrepancies between the values of $q$ and $\sqrt{2}\,t$ for $\nu = 5$ and 7.
















(All Rights reserved)
BIOMETRIKA. Vol. 40, Parts 1 and 2 
CONTENTS 


The superposition of several strictly periodic sequences of events. By D. R. Cox and W. L. Smith
Approximate confidence intervals. By M. S. Bartlett
Incomplete and absolute moments of the multivariate normal distribution with some applications.
By A. R. Kamat
On the range of partial sums of a finite number of independent normal variates. By A. A. Anis and
Note on ‘the Jacobians of certain matrix transformations useful in multivariate analysis’. By
Ingram Olkin
Estimation of a functional relationship. By D. V. Lindley
Estimating parameters in truncated Pearson frequency distributions without resort to higher
moments. By A. C. Cohen, Jr.
A problem of interference between two queues. By J. C. Tanner
Tables of the angular transformation. By W. L. Stevens
Tests of significance in a 2×2 contingency table: extension of Finney’s table. Computed by
R. Latscha
A method for judging all contrasts in the analysis of variance. By Henry Scheffé
The estimation and comparison of strength of association in contingency tables. By A. Stuart
A sequential test for randomness. By P. G. Moore
On the mean successive difference and its ratio to the root mean square. By A. R. Kamat
The effect of unequal group variances on the F-test for the homogeneity of group means. By G.
Horsnell
The estimation of population parameters from data obtained by means of the capture-recapture
method. III. An example of the practical applications of the method. By P. H. Leslie, Dennis
Chitty and Helen Chitty
On the utilization of marked specimens in estimating populations of flying insects. By C. C. Craig
The total size of a general stochastic epidemic. By Norman T. J. Bailey
Experimental evidence concerning contagious distributions in ecology. By D. A. Evans
MISCELLANEA 


Time intervals between accidents—a note on Maguire, Pearson and Wynn’s paper. By G. A.
Barnard
Further notes on the analysis of accident data. By B. A. Maguire, E. S. Pearson and A. H. A.
On a method of estimating biological populations in the field. By C. C. Craig
A rapid method for estimating the correlation coefficient from the range of the deviations about
the reduced major axis. By C. H. Leigh-Dugmore
The effect of overlapping in bacterial counts of incubated colonies. By C. Mack
Non-normality in two-sample t-tests. By D. G. C. Gronow
Note on the Poisson Index of Dispersion. By N. Kathirgamatamby
On an extension of Geary’s theorem. By R. G. Laha
The Doolittle method and the fitting of polynomials to weighted data. By P. G. Guest
A simple method of deriving best critical regions similar to the sample space in tests of an important
class of composite hypotheses. By K. S. Rao


REVIEWS
Gerhard Tintner’s ‘Econometrics’
Acheson J. Duncan’s ‘Quality Control and Industrial Statistics’


CORRIGENDA


First printed in Great Britain at the University Press, Cambridge
Reprinted by offset-litho by Percy Lund Humphries & Co., Ltd.








